
Comparing Microsoft Word documents stored in a Subversion repository

Update (2005/07/26): the latest version of TortoiseSVN comes with VBScripts for comparing Word and OpenOffice documents.

TortoiseSVN has a nifty feature that lets you specify a custom diff program based on the extension of the file being diffed. This gave me the idea for this small Python script, which launches Word in document comparison mode:

# Use the pywin32 extension and Microsoft Word to compare two Word documents
# Use this script with TortoiseSVN and extension-specific diff programs
# with a command line like this:
# c:\python24\pythonw.exe c:\path\to\word_diff.py %base %mine
# $Id: word_diff.py 1355 2005-06-30 11:16:59Z nlehuen $

import win32com.client
import sys
import os

try:
    # Get the absolute paths for the arguments
    # Word requires absolute paths.
    p1 = os.path.abspath(sys.argv[1])
    p2 = os.path.abspath(sys.argv[2])
    print "Comparing %s to %s..."%(p1,p2)

    # Open Word
    word = win32com.client.Dispatch("Word.Application")

    # Open the working copy (%mine)
    destination = word.Documents.Open(p2)
    # Hide it
    destination.Windows[0].Visible=0

    # Compare it to the base revision (%base)
    compare = destination.Compare(p1)

    # Show the comparison result
    word.ActiveDocument.Windows[0].Visible = 1

    # Mark the comparison document as saved to prevent the annoying
    # "Save as" dialog from appearing.
    word.ActiveDocument.Saved = 1

    # Close the first document
    destination.Close()
except:
    # In case of an exception, display it and wait for user input.
    import traceback
    traceback.print_exc()
    raw_input()

The script is very simple, but working out the precise sequence of COM calls (which document to show, which to close, whether to open the first and compare it to the second or the reverse, etc.) took a few minutes of work. This might be more naturally implemented in VBScript on top of Windows Script Host, but I prefer coding in Python.
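For readers on a current Python, here is a minimal sketch of the same call sequence, assuming Python 3 and a recent pywin32. The function name compare_in_word is hypothetical, and it takes the Word.Application object as a parameter so the ordering can be checked with a stand-in object on a machine without Word. It also sets word.Visible up front, as Andrew Durdin suggests in the comments below. Like the original script, it relies on the comparison result being the active document after Compare().

```python
import os


def compare_in_word(word, base_path, mine_path):
    """Run the comparison on an existing Word.Application COM object."""
    # Word requires absolute paths.
    base = os.path.abspath(base_path)
    mine = os.path.abspath(mine_path)

    # Show Word first, so that a later failure does not leave it hidden.
    word.Visible = True

    # Open the working copy, hide its window, then compare it to the base.
    destination = word.Documents.Open(mine)
    destination.Windows[0].Visible = False
    destination.Compare(base)

    # Assumption carried over from the original script: after Compare(),
    # the active document is the comparison result.
    result = word.ActiveDocument
    result.Windows[0].Visible = True
    # Mark the result as saved to avoid the "Save as" dialog on close.
    result.Saved = True

    # The working copy itself is no longer needed.
    destination.Close()
    return result
```

With pywin32 installed, it would be called as compare_in_word(win32com.client.Dispatch("Word.Application"), sys.argv[1], sys.argv[2]), with %base and %mine passed in that order as in the command line above.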

  • Michael Chermside

    Very useful tool! Would you be willing to give explicit permission for me (and my company) to use this?

  • Andrew Durdin

    Very nice. However, if Word is not currently running when this script runs, then the resulting application will not be shown; you should add
    word.Visible = 1
    immediately after opening the Word.Application object (if you add it later, then any exception would prevent Word from being shown).
    You could also optionally add word.Activate() if you wanted Word brought to the foreground…

  • http://minkirri.apana.org.au/~abo/ Donovan Baarda

    I vaguely remember looking at the stuff you can do with OpenOffice, and believe that you can use OOwriter as not just a diff tool, but a three-way merge tool. This would allow you to do branch/merge stuff with documents.

  • shad

    Please help. I really need a tool that can compare two Word documents in svn, but the code above only produces an error:

    Comparing c:\1.doc to c:\2.doc...
    Traceback (most recent call last):
      File "C:\Program Files\TortoiseSVN\doc-compare.py", line 29, in ?
        compare = destination.Compare(p1)
      File "<COMObject Open>", line 5, in Compare
    com_error: (-2147352567, '\xce\xf8\xe8\xe1\xea\xe0.', (0, 'Microsoft Word', '\xce\xf8\xe8\xe1\xea\xe0 \xea\xee\xec\xe0\xed\xe4\xfb', 'C:\\Program Files\\Microsoft Office\\OFFICE11\\1049\\wdmain11.chm', 36966, -2146824090), None)

    The hex text doesn’t contain any useful info – just ‘Error’ and ‘Call failed’. I am using MS Office v.11 (2003).

  • Phipps

    It sounds like a really great tool for comparing two Word documents! Is there a screenshot of the comparison result, or an installation tutorial?

    Maybe you can create an open source project for this great idea and great implementation.

  • http://dkiroku.com dota-don

    xdocdiff – diff for Word, Excel, PowerPoint, pdf files with TortoiseSVN freemind.s57.xrea.com/xdo…

  • Octavi

    I wrote a macro for OpenOffice:

    http://www.oooforum.org/forum/vi...

    It works fine from the command line if OpenOffice is already open (or quickstart). If someone knows a better way to do it, write it there!

    Octavi

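A note on the traceback shad posted above: the escaped bytes in the com_error tuple can be turned back into readable text. The sketch below assumes they are Windows-1251 (Cyrillic) bytes, which fits the 1049 folder in the help file path (1049 is the Russian locale ID). It is written for Python 3 and only decodes the message; it does not explain why Compare() failed.

```python
# Byte strings copied from the com_error tuple in the traceback above.
raw_messages = [
    b"\xce\xf8\xe8\xe1\xea\xe0.",
    b"\xce\xf8\xe8\xe1\xea\xe0 \xea\xee\xec\xe0\xed\xe4\xfb",
]

# Assumption: the bytes are Windows-1251 encoded text.
decoded = [raw.decode("cp1251") for raw in raw_messages]

# decoded == ['Ошибка.', 'Ошибка команды']
```

The decoded strings are the Russian words for "Error." and "Command error", which matches shad's reading that the message itself carries no further detail.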