Extracting ChemDraw schemes as .cdx files from MS Word/Excel/PowerPoint documents

I couldn’t find any free software that would easily do exactly what I want (see the title). So here’s my quick-and-dirty solution, which can be easily automated.

  1. Change the file extension from .docx/.xlsx/.pptx to .zip
  2. Unpack the .zip file
  3. In the unpacked folder, go to ‘word/embeddings’, ‘xl/embeddings’ or ‘ppt/embeddings’
  4. Rename ‘oleObject1.bin’ to ‘almostThere.rar’
  5. Unpack ‘almostThere.rar’ (e.g. with 7-Zip)
  6. Go to the folder ‘almostThere/’
  7. Rename the file ‘CONTENTS’ to ‘bingo.cdx’
  8. Double-click and enjoy
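
Steps 1 to 3 can also be scripted. Here is a minimal Python sketch, not a polished tool: the Office formats are ZIP archives, so the standard `zipfile` module can read them directly and the rename is not needed. The function name `extract_parts` and its parameters are made up for illustration.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_parts(office_file, out_dir, subfolder="embeddings"):
    """Copy every file stored under <word|xl|ppt>/<subfolder>/ into out_dir.

    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(office_file) as archive:
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            # Expect entries such as 'word/embeddings/oleObject1.bin'.
            if (
                len(parts) == 3
                and parts[0] in ("word", "xl", "ppt")
                and parts[1] == subfolder
            ):
                target = out_dir / parts[2]
                target.write_bytes(archive.read(name))
                written.append(target)
    return written
```

Calling `extract_parts("report.docx", "extracted")` (both names are placeholders) should leave the oleObject files in ‘extracted/’, ready for step 4. Passing `subfolder="media"` pulls out the files from the media folder instead.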

As a free gift from Microsoft you’ll get an .emf image of all embedded ChemDraw schemes in the ‘*/word/media’ folder.

Note 1. Definitely works under Windows 10 with MS Office 2013 documents.

Note 2. Apparently one needs administrator rights on Windows 10 to change a file’s extension.

Note 3. Doesn’t work with structures added to an Excel spreadsheet via the ChemBioOffice plugin. Only the .emf images can be extracted.

Note 4. Back up the data before you mess something up.

Note 5. In fact, the final .rar file is not a true RAR archive (it is an OLE compound file), but decompression software (e.g. 7-Zip) will still open it. It can just as well be renamed to .tar, .arj, .7z or .cab (but not .zip).
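
Note 5 can be checked by looking at the first bytes of the file. Below is a small Python sketch of that check. The names `sniff_container` and `sniff_file` are my own, and the signatures are the usual magic numbers for OLE compound files, RAR and ZIP. As far as I can tell, an oleObject file starts with the OLE signature, whatever extension it carries.

```python
# Magic numbers found at the very start of each container format.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file
RAR_MAGIC = b"Rar!\x1a\x07"                    # prefix shared by RAR 4 and RAR 5
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header


def sniff_container(head):
    """Return 'ole', 'rar', 'zip' or 'unknown' for the leading bytes of a file."""
    if head.startswith(OLE_MAGIC):
        return "ole"
    if head.startswith(RAR_MAGIC):
        return "rar"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path):
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

For example, `sniff_file("almostThere.rar")` looks only at the content, so renaming the file does not change the answer.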

Note 6. Under Linux, only 7-Zip could reliably extract files from the oleObject archive, regardless of its extension.
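
If 7-Zip is not at hand, steps 4 to 7 can probably be done in Python as well. The sketch below swaps 7-Zip for the third-party `olefile` package (an assumption: it has to be installed separately, e.g. with `pip install olefile`) and reads the ‘CONTENTS’ stream straight out of the oleObject file. The function names `cdx_target` and `extract_cdx` are hypothetical, and I have not tried this on every kind of embedded ChemDraw object.

```python
from pathlib import Path


def cdx_target(bin_path):
    """Map '.../oleObject3.bin' to '.../oleObject3.cdx'."""
    return Path(bin_path).with_suffix(".cdx")


def extract_cdx(bin_path, stream_name="CONTENTS"):
    """Save the CONTENTS stream of one OLE object as a .cdx file.

    Returns the new path, or None if the file is not an OLE compound
    file or has no stream with that name.
    """
    import olefile  # third-party package, assumed to be installed

    if not olefile.isOleFile(str(bin_path)):
        return None
    ole = olefile.OleFileIO(str(bin_path))
    try:
        if not ole.exists(stream_name):
            return None
        data = ole.openstream(stream_name).read()
    finally:
        ole.close()  # release the file handle even on early return
    target = cdx_target(bin_path)
    target.write_bytes(data)
    return target
```

Used on ‘oleObject1.bin’, this should write ‘oleObject1.cdx’ next to it, without the intermediate rename to .rar.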

 

Author: Slava Bernat

I did my PhD in medicinal chemistry/chemical biology of G protein-coupled receptors and then explored some chemical biology of non-coding RNA as a postdoc. Currently I'm working at a small biotech company in the San Francisco Bay Area as a research chemist. I write about science that catches my attention in my RSS feed reader, plus some random thoughts and tutorials.

3 thoughts on “Extracting ChemDraw schemes as .cdx files from MS Word/Excel/PowerPoint documents”

  1. Hey, I’ve used it! I’ve got 42 objects and after step 2 made a script:

    # Run inside the 'embeddings' folder, with the objects already renamed to .rar
    for ((i=1; i<=42; i++))
    do
        mkdir -p "$i"
        7z x "oleObject$i.rar" -o"$i"
        mv "$i/CONTENTS" "$i/$i.cdx"
    done

    Thanks!
