This is a snapshot of Indico's old Trac site. Any information contained herein is most probably outdated.

#1483 closed defect (fixed)

UTF8 issue while migrating from 0.98 to 1.1

Reported by: jdefaver
Owned by:
Priority: normal
Milestone: v1.2
Component: General
Version: 1.1
Keywords:
Cc:

Description

Hi,

I followed the instructions here:

https://indico-software.org/wiki/Releases/Indico1.1

in order to upgrade. In the migrate.py step, I got the following:

[1289/1620 79.567901%] a058 PIC meeting                                          
Migration failed! DB may be in  an inconsistent state:
Traceback (most recent call last):
  File "/opt/indico/bin/migration/migrate.py", line 897, in main
    dry_run=args.dry_run)
  File "/opt/indico/bin/migration/migrate.py", line 843, in runMigration
    task(dbi, withRBDB, prevVersion)
  File "/opt/indico/bin/migration/migrate.py", line 734, in indexConferenceTitle
    nameIdx.index(conf.getId(), conf.getTitle().decode('utf-8'))
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 44: invalid start byte

Indico seems to work, but only some of the meetings have been migrated, and I have no idea what the consequences are.

Could you please help me:

  • solve this issue
  • get my indico installation in a nicer state

I have a backup of the whole server from the morning before the upgrade.

Thanks,

Jerome

P.S.: In the upgrade instructions, it would be nice to add that one should use

easy_install -U indico

instead of

easy_install indico

and that the --with-rb option belongs to migrate.py, not to indico_initial_setup.

Change History (21)

comment:1 Changed 20 months ago by jbenito

Hi Jerome,

Do you have a backup of the DB? If so, I would recommend restarting the migration from scratch. It is safer.
But before running migrate.py on the backup copy, you need to fix your DB. It looks like your indexes contain data that is not in UTF-8. To fix that, connect through indico_shell and run:

from indico.util.string import fix_broken_string
# Re-encode every conference title as valid UTF-8, then commit the changes
conf_list = ConferenceHolder().getValuesToList()
for conf in conf_list:
    conf.setTitle(fix_broken_string(conf.getTitle()))

dbi.commit()
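
Before rewriting every title, it may help to list which ones are actually affected. The following is only a minimal sketch, written for Python 3 and working on plain (id, title bytes) pairs rather than on Indico objects. The helper name find_non_utf8 and the sample data are hypothetical and not part of Indico; inside indico_shell the pairs would presumably be built from conf.getId() and conf.getTitle().

```python
def find_non_utf8(titles):
    """Return (id, position) for every title that is not valid UTF-8.

    `titles` is an iterable of (conference_id, title_bytes) pairs.
    """
    broken = []
    for conf_id, raw in titles:
        try:
            raw.decode('utf-8')
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte
            broken.append((conf_id, exc.start))
    return broken


# Hypothetical sample data: the second title is Latin-1 encoded (byte 0xb5).
sample = [
    ('a057', 'Weekly meeting'.encode('utf-8')),
    ('a058', 'PIC meeting 5 \xb5m'.encode('latin1')),
]
print(find_non_utf8(sample))
```

An empty result would suggest the titles themselves are fine and the broken data lives elsewhere.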

comment:2 Changed 20 months ago by jbenito

  • Status changed from new to infoneeded_new

comment:3 Changed 20 months ago by jbenito

  • Milestone set to v1.1
  • Status changed from infoneeded_new to new

comment:4 Changed 20 months ago by jbenito

  • Milestone changed from v1.1 to v1.2

comment:5 Changed 20 months ago by jdefaver

Hi,

Thanks for your replies. I used my backup to revert to 0.98 completely. I tried the suggested lines but I got:

>>> from indico.util.string import fix_broken_string
Traceback (most recent call last):
  File "<console>", line 1, in <module>
ImportError: cannot import name fix_broken_string

I can import indico.util.string but then:

>>> import indico.util.string as indicostring
>>> dir(indicostring)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'html_line_breaks', 'remove_accents', 'remove_non_alpha', 'unicodeOrNone', 'unicodedata']

Is fix_broken_string supposed to be defined in 0.98?

comment:6 Changed 20 months ago by jbenito

Did you revert only your DB to 0.98 or also Indico's version?
You should have a DB from your 0.98 Indico version, but you should have Indico v1.1 installed.

comment:7 Changed 20 months ago by jdefaver

I also reverted the whole indico installation, sorry for the lack of details.

comment:8 Changed 20 months ago by jbenito

So please install v1.1.2 and then apply the script I sent you. Finally, re-run migrate.py.

comment:9 Changed 20 months ago by jdefaver

Hi,

I did as you suggested, but still no success:

# indico_shell 
+ 'MaKaC' : MaKaC base package
+ 'Conference'
+ 'Category'
+ 'ConferenceHolder'
+ 'CategoryManager'
+ 'AvatarHolder'
+ 'GroupHolder'
+ 'HelperMaKaCInfo'
+ 'PluginsHolder'
+ 'Catalog'
+ 'IndexesHolder'
+ 'minfo' : MaKaCInfo instance

indico 1.1.2

>>> from indico.util.string import fix_broken_string
Traceback (most recent call last):
  File "<console>", line 1, in <module>
ImportError: cannot import name fix_broken_string
>>> import indico.util.string as istring
>>> dir(istring)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'html_line_breaks', 'permissive_format', 're', 'remove_accents', 'remove_extra_spaces', 'remove_non_alpha', 'remove_tags', 'truncate', 'unicodeOrNone', 'unicodedata'] 

comment:10 Changed 20 months ago by jbenito

Um... sorry, it looks like we only added that function in 1.2. Anyway, you can define it directly in indico_shell, just before running the code I pasted in my first comment. Copy and paste this:

def fix_broken_string(text):
    try:
        text = text.decode('utf-8')
    except UnicodeDecodeError:
        try:
            text = text.decode('latin1')
        except UnicodeDecodeError:
            text = unicode(text, 'utf-8', errors='replace')
    return text.encode('utf-8')
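
To see what this function does with the byte from the original traceback (0xb5), here is a rough Python 3 port. It is only an illustration: the code above is Python 2 and is what should be pasted into indico_shell. The name fix_broken_bytes and the sample titles are made up. Note that decoding as latin1 accepts any byte sequence, so the final "replace" fallback is effectively never reached and is left out of the sketch.

```python
def fix_broken_bytes(raw):
    """Python 3 sketch: return `raw` re-encoded as valid UTF-8 bytes."""
    try:
        text = raw.decode('utf-8')
    except UnicodeDecodeError:
        # latin1 maps every possible byte to a code point, so this cannot fail
        text = raw.decode('latin1')
    return text.encode('utf-8')


# A made-up Latin-1 encoded title containing the byte 0xb5 (micro sign)
latin1_title = 'PIC meeting 5 \xb5m'.encode('latin1')
fixed = fix_broken_bytes(latin1_title)
print(fixed.decode('utf-8'))

# Titles that are already valid UTF-8 pass through unchanged
utf8_title = 'R\u00e9union'.encode('utf-8')
print(fix_broken_bytes(utf8_title) == utf8_title)
```

The trade-off is that any non-UTF-8 title is assumed to be Latin-1; text in another legacy encoding would come out as valid UTF-8 but with the wrong characters.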

comment:11 Changed 20 months ago by jdefaver

Hi,

It seems to work, but the migration has been running for about an hour now (I started it as soon as I got your answer). Is that expected?

comment:12 Changed 20 months ago by jbenito

No feedback from the migration script? I fear it is failing to connect to the DB. Are you sure the connection to the DB took place?

comment:13 Changed 20 months ago by jdefaver

Is there a way to find out?

# python /opt/indico/bin/migration/migrate.py --prev-version=0.98.2 --with-rb

This script will migrate your Indico DB to a new version. We recommend that
this operation be executed while the web server is down, in order to avoid
concurrency problems and DB conflicts.


Are you sure you want to execute the migration now? [y/N] y

Executing migration...

comment:14 Changed 20 months ago by jdefaver

I stopped it and indeed it looks like no connection took place:

^C
Migration failed! DB may be in  an inconsistent state:
Traceback (most recent call last):
  File "/opt/indico/bin/migration/migrate.py", line 897, in main
    dry_run=args.dry_run)
  File "/opt/indico/bin/migration/migrate.py", line 809, in runMigration
    dbi = DBMgr.getInstance()
  File "/usr/lib/python2.6/site-packages/indico-1.1.2-py2.6.egg/MaKaC/common/db.py", line 96, in getInstance
    cls._instance=DBMgr(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/indico-1.1.2-py2.6.egg/MaKaC/common/db.py", line 87, in __init__
    max_disconnect_poll=max_disconnect_poll)
  File "/usr/lib/python2.6/site-packages/ZODB3-3.10.5-py2.6-linux-x86_64.egg/ZEO/ClientStorage.py", line 420, in __init__
    self._wait(wait_timeout)
  File "/usr/lib/python2.6/site-packages/ZODB3-3.10.5-py2.6-linux-x86_64.egg/ZEO/ClientStorage.py", line 437, in _wait
    self._rpc_mgr.connect(sync=1)
  File "/usr/lib/python2.6/site-packages/ZODB3-3.10.5-py2.6-linux-x86_64.egg/ZEO/zrpc/client.py", line 280, in connect
    self.cond.wait(self.sync_wait)
  File "/usr/lib64/python2.6/threading.py", line 258, in wait
    _sleep(delay)
KeyboardInterrupt

comment:15 Changed 20 months ago by jbenito

Yes, it never took place. On the other hand, you can connect using indico_shell, right?

comment:16 Changed 20 months ago by jbenito

I mean, if you run indico_shell and you see the Python prompt, it means that the connection to the DB has been established.
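
If it is unclear whether the database server is reachable at all, a quick TCP check can rule that out before starting migrate.py. This is a minimal Python 3 sketch, not something shipped with Indico: can_connect is a hypothetical helper, and the host and port below are placeholders to be replaced with whatever your Indico configuration points to. It only tells you that something is listening on that port, not that it is really the DB server.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors
        return False


# Placeholder address: replace with your own DB host and port
print(can_connect('localhost', 9675))
```

A False result here would explain a migration that prints "Executing migration..." and then waits forever.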

comment:17 Changed 20 months ago by jdefaver

Ah OK, my mistake: I had stopped not only the httpd server but all of Indico in the process, which obviously prevented access to the DB.

Now the migration runs but fails with:

# Indexing Conference Title (1.1)

Migration failed! DB may be in  an inconsistent state:
Traceback (most recent call last):
  File "/opt/indico/bin/migration/migrate.py", line 897, in main
    dry_run=args.dry_run)
  File "/opt/indico/bin/migration/migrate.py", line 843, in runMigration
    task(dbi, withRBDB, prevVersion)
  File "/opt/indico/bin/migration/migrate.py", line 734, in indexConferenceTitle
    nameIdx.index(conf.getId(), conf.getTitle().decode('utf-8'))
  File "/usr/lib/python2.6/site-packages/indico-1.1.2-py2.6.egg/MaKaC/common/indexes.py", line 1232, in index
    intId = self.addString(entryId)
  File "/usr/lib/python2.6/site-packages/indico-1.1.2-py2.6.egg/MaKaC/common/indexes.py", line 1171, in addString
    raise KeyError("Key '%s' already exists in index!" % stringId)
KeyError: "Key '0' already exists in index!"
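
According to the traceback, the KeyError is raised when the index is asked to add an id it already holds. One way to check whether the same id really occurs more than once in the list being indexed is simply to count the ids. A small Python 3 sketch on plain lists instead of Indico objects; find_duplicate_ids and the sample ids are made up for illustration:

```python
from collections import Counter


def find_duplicate_ids(conf_ids):
    """Return the ids that occur more than once, in sorted order."""
    counts = Counter(conf_ids)
    return sorted(cid for cid, n in counts.items() if n > 1)


# Made-up ids for illustration only
print(find_duplicate_ids(['0', '1', 'a058', '0']))
```

If this comes back empty for the real ids, the duplicate is more likely already stored in the index than present in the conference list.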

comment:18 Changed 20 months ago by jbenito

Um... that's really weird. It looks as if you have a conference twice in the index.
Would you have some time today to connect to our chatroom? I think we could sort out your problems much faster there. Would that be possible?

Details on how to connect to jabber/xmpp: http://indico-software.org/wiki/Community#Jabber

Cheers,
Jose

comment:19 Changed 20 months ago by jdefaver

I am available to chat for the next 3 hours. I'll try to connect to the chatroom.

comment:20 Changed 20 months ago by jdefaver

I'm in the Indico room, ready when you are.

comment:21 Changed 20 months ago by jbenito

  • Resolution set to fixed
  • Status changed from new to closed

I am closing this ticket since the migration is done.
