Extracting ChemDraw schemes as .cdx files from MS Word/Excel/PowerPoint documents

I couldn’t find free software that does exactly what I want (see the title), so here’s my quick-and-dirty solution, which can be easily automated.

  1. Change the file extension from .docx/.xlsx/.pptx to .zip
  2. Unpack the .zip file
  3. In the unpacked folder, go to ‘word/embeddings’, ‘xl/embeddings’ or ‘ppt/embeddings’ (for Word, Excel or PowerPoint, respectively)
  4. Rename ‘oleObject1.bin’ to ‘almostThere.rar’
  5. Unpack ‘almostThere.rar’ with an archiver such as 7zip
  6. Go to the folder ‘almostThere/’
  7. Rename the file ‘CONTENTS’ to ‘bingo.cdx’
  8. Double-click and enjoy
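
Since the steps above are easy to automate, here is a minimal shell sketch of how that could look. It assumes the `unzip` and `7z` command-line tools are installed; the function names `embeddings_dir` and `extract_cdx` are made up for illustration and are not part of any tool. The renaming steps are skipped because both tools open the files regardless of extension.

```shell
# Sketch only: embeddings_dir and extract_cdx are hypothetical helper names.
# Assumes the `unzip` and `7z` command-line tools are available.

# Step 3: map an Office file name to the folder holding its embedded objects.
embeddings_dir() {
    case "${1##*.}" in
        docx) echo "word/embeddings" ;;
        xlsx) echo "xl/embeddings" ;;
        pptx) echo "ppt/embeddings" ;;
        *)    return 1 ;;
    esac
}

# Usage: extract_cdx report.docx out_dir
extract_cdx() {
    local doc="$1" out="$2" emb tmp ole n
    emb=$(embeddings_dir "$doc") || { echo "unsupported file: $doc" >&2; return 1; }
    tmp=$(mktemp -d) || return 1
    mkdir -p "$out"
    # Steps 1-2: unzip reads the document directly, no renaming to .zip needed.
    unzip -q -o "$doc" "$emb/*" -d "$tmp" || { rm -rf "$tmp"; return 1; }
    for ole in "$tmp/$emb"/oleObject*.bin; do
        [ -e "$ole" ] || continue
        n=$(basename "$ole" .bin)
        # Steps 4-6: unpack the embedded object into its own folder.
        7z x "$ole" -o"$tmp/$n" -y >/dev/null || continue
        # Step 7: the CONTENTS file is the ChemDraw scheme; skip objects without one.
        [ -f "$tmp/$n/CONTENTS" ] && mv "$tmp/$n/CONTENTS" "$out/$n.cdx"
    done
    rm -rf "$tmp"
}
```

Calling `extract_cdx report.docx schemes` should leave `schemes/oleObject1.cdx`, `schemes/oleObject2.cdx` and so on, one per embedded object that contains a CONTENTS file.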

As a free gift from Microsoft, you’ll also get an .emf image of each embedded ChemDraw scheme in the ‘*/word/media’ folder.

Note 1. Confirmed to work under Windows 10 with MS Office 2013 documents.

Note 2. Apparently one needs administrator rights on Windows 10 to change a file’s extension.

Note 3. Doesn’t work with structures added to an Excel spreadsheet via the ChemBioOffice plugin; for those, only the .emf images are extractable.

Note 4. Back up the data before you mess something up.

Note 5. In fact, the last .rar file is not a true RAR archive, but decompression software (e.g. 7zip) will still open it. It can just as well be renamed to .tar, .arj, .7z or .cab (but not .zip).
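
To see what the file really is, one can look at its first bytes. As far as I know, ‘oleObjectN.bin’ is normally an OLE2 compound file (the same container used by legacy .doc/.xls files), which starts with the signature D0 CF 11 E0 A1 B1 1A E1; that would explain why the extension doesn’t matter to 7zip. A small sketch of such a check, where `is_ole_file` is a made-up helper name:

```shell
# Sketch only: is_ole_file is a hypothetical helper name.
# Succeeds if the file starts with the OLE2 compound file signature.
is_ole_file() {
    # First 8 bytes as lowercase hex, with spaces and newlines stripped.
    sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "d0cf11e0a1b11ae1" ]
}
```

For example, `is_ole_file oleObject1.bin && echo "OLE container"` prints the message only when the signature matches.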

Note 6. Under Linux, only 7zip could reliably extract files from the oleObject archive, regardless of its extension.



Author: Slava Bernat

I did my PhD in medicinal chemistry/chemical biology of G protein-coupled receptors and then explored some chemical biology of non-coding RNA as a postdoc. Currently I'm working as a research chemist in a small biotech company in the San Francisco Bay Area. I write about science that catches my attention in my RSS feed reader, plus some random thoughts and tutorials.

6 thoughts on “Extracting ChemDraw schemes as .cdx files from MS Word/Excel/PowerPoint documents”

  1. Hey, I’ve used it! I’ve got 42 objects and after step 2 made a script:

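    # Run in the embeddings folder; assumes oleObjectN.bin was renamed to oleObjectN.rar
    for ((i=1; i<=42; i++)); do
        mkdir "$i"
        # Unpack object i into its own folder
        7z x "oleObject$i.rar" -o"$i"
        # The CONTENTS file is the ChemDraw scheme
        mv "$i/CONTENTS" "$i/$i.cdx"
    done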


  2. Hi, I generated a ChemDraw scheme on a Mac, copied and pasted it into Word and saved it. I tried to follow your instructions on Windows 10 with Word 2016 but I can’t find the word/embeddings folder. All I saw were the “_rels”, “docProps”, and “word” folders. Would you have any suggestions for me? Thank you

    1. Can you still open the scheme in ChemDraw by double-clicking it on Windows? It might be a general Mac-Windows compatibility issue of ChemDraw. I don’t have a Mac with ChemDraw installed to test but I’ve heard about this issue many times before.

      1. Hi, no, double-clicking on the scheme won’t work if the scheme was generated on a Mac. Perhaps that is the main issue. I don’t have a Windows machine that has ChemDraw, the opposite of your issue. Anyway, thank you so much though!
