Automatically convert all of the .doc files in a directory to .docx files in C#

Whenever I double-click a .doc file on my system running Vista and Microsoft Word 2007, I get this error message:
Error

The operating system is not presently configured to run this application.
Every time this happens I have to wonder why Microsoft decided to make Word not understand .doc files by default. Perhaps it was to "encourage" people to move to .docx files.

This program converts all of the .doc files in a directory into .docx files. This program includes a reference to Microsoft.Office.Interop.Word 12.0.0.0 to allow it to manipulate Microsoft Word files. To add such a reference, open the Project menu and select Add Reference. On the .NET tab, select the Microsoft.Office.Interop.Word entry and click OK.

To make the classes defined in that library easier to use, the program includes the following using directive:

using Word = Microsoft.Office.Interop.Word;

This allows the program to use the prefix Word to indicate classes in the library.

If you enter or select a directory path and click the Convert button, the following code executes.

// Convert the files in the directory.
private void btnConvert_Click(object sender, EventArgs e)
{
    // Open the Word server.
    Word._Application word_app = new Word.ApplicationClass();

    // Make a couple of objects used inside and outside of the loop.
    object missing = System.Reflection.Missing.Value;
    object save_changes = false;
    
    // Loop through the files.
    int num_converted = 0;
    lstFiles.Items.Clear();
    DirectoryInfo dir_info = new DirectoryInfo(txtDirectory.Text);
    foreach (FileInfo file_info in dir_info.GetFiles("*.doc"))
    {
        // Skip .docx files, which the pattern *.doc can also match.
        if (file_info.Extension.ToLower() == ".docx") continue;

        // Get the converted file's name.
        int name_length = file_info.FullName.Length - file_info.Extension.Length;
        string new_filename = file_info.FullName.Substring(0, name_length) + ".docx";

        // See if this file has already been converted.
        if (File.Exists(new_filename))
        {
            lstFiles.Items.Add("Skipped " + file_info.Name);
        }
        else
        {
            lstFiles.Items.Add("Converted " + file_info.Name);
            num_converted++;

            // Open the file.
            object filename = file_info.FullName;
            object confirm_conversions = false;
            object read_only = true;
            object add_to_recent_files = false;
            object format = 0;    // 0 is Word.WdOpenFormat.wdOpenFormatAuto.
            Word._Document word_doc =
                word_app.Documents.Open(ref filename, ref confirm_conversions,
                    ref read_only, ref add_to_recent_files,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref format, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing);

            // Save as a .docx file.
            filename = new_filename;
            object file_format = Word.WdSaveFormat.wdFormatDocumentDefault;
            word_doc.SaveAs(ref filename, ref file_format, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing);

            // Close the document without prompting.
            word_doc.Close(ref save_changes, ref missing, ref missing);
        }
    }

    // Close the word application.
    word_app.Quit(ref save_changes, ref missing, ref missing);

    int num_skipped = lstFiles.Items.Count - num_converted;
    MessageBox.Show("Converted " + num_converted.ToString() + " files.\n" +
        "Skipped " + num_skipped.ToString() + " files.",
        "Done", MessageBoxButtons.OK, MessageBoxIcon.Information);
}

This code creates a Word application object to work with Word. It then defines a couple of variables that it can pass to Word methods. Those methods take almost every parameter by reference (with the ref keyword), so the values must be stored in variables before you can pass them. Probably the most important and confusing value is System.Reflection.Missing.Value. If you want to omit a parameter to one of the Word methods, you cannot simply pass the method null. Instead you must pass this special value by reference.
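If the project targets C# 4.0 or later, the compiler lets you omit the ref keyword and leave out optional arguments when calling methods on COM interfaces, so the Missing placeholders are not required. The following is a minimal sketch under that assumption. The ConvertOne helper and its parameter names are hypothetical and are not part of the original program.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class OptionalArgumentsSketch
{
    // Hypothetical helper: converts one file by using named arguments
    // instead of passing Missing.Value by reference for every parameter.
    // Assumes a C# 4.0 or later compiler and an already running Word server.
    public static void ConvertOne(Word._Application word_app,
        string doc_path, string docx_path)
    {
        // Open the .doc file read-only without confirmation prompts.
        Word._Document word_doc = word_app.Documents.Open(
            FileName: doc_path,
            ConfirmConversions: false,
            ReadOnly: true,
            AddToRecentFiles: false);

        // Save in the default format and close without prompting.
        word_doc.SaveAs(
            FileName: docx_path,
            FileFormat: Word.WdSaveFormat.wdFormatDocumentDefault);
        word_doc.Close(SaveChanges: false);
    }
}
```

With older compilers, the explicit ref missing arguments shown in the original listing are still needed.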

Having created the two values, the code creates a DirectoryInfo object representing the directory you entered and uses its GetFiles method to enumerate the files matching the pattern *.doc in the directory.

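The pattern *.doc can also match .docx files, because in the .NET Framework a search pattern with a three-character extension returns files whose extensions begin with that extension. If a file's extension is .docx, the program uses a continue statement to skip the file and continue the loop.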
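One way to make that check explicit is to keep only the files whose extension is exactly .doc, ignoring case. This is a small sketch, not the original program's code. The DocFileFilter class and GetDocFiles method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DocFileFilter
{
    // Hypothetical helper: return only the files in the directory
    // whose extension is exactly ".doc" (compared without regard to case).
    public static List<FileInfo> GetDocFiles(string directory)
    {
        var result = new List<FileInfo>();
        DirectoryInfo dir_info = new DirectoryInfo(directory);
        foreach (FileInfo file_info in dir_info.GetFiles("*.doc"))
        {
            // Drop anything the pattern matched that is not a true .doc file.
            if (string.Equals(file_info.Extension, ".doc",
                StringComparison.OrdinalIgnoreCase))
            {
                result.Add(file_info);
            }
        }
        return result;
    }
}
```

Comparing with StringComparison.OrdinalIgnoreCase avoids the culture-sensitive behavior of ToLower.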

The code then calculates the file's name with a .docx extension. If that file already exists, the program adds a statement saying it skipped the file to the ListBox on the form. If the .docx file does not exist, the program adds a message saying it converted the file and then starts doing the real work.
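The name calculation can also be written with System.IO.Path.ChangeExtension instead of Substring arithmetic. This is an alternative sketch rather than the code the program uses. The DocxNames class and GetDocxName method are hypothetical names.

```csharp
using System.IO;

static class DocxNames
{
    // Hypothetical helper: return the path of the converted file by
    // replacing the existing extension with ".docx".
    public static string GetDocxName(string doc_path)
    {
        return Path.ChangeExtension(doc_path, ".docx");
    }
}
```

The program could then test File.Exists(DocxNames.GetDocxName(file_info.FullName)) to decide whether to skip the file.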

The code uses the Word application's Documents collection's Open method to open the .doc file. (Notice all of the Missing parameters passed with the ref keyword.) The code then simply calls the document's SaveAs method to make it resave the document in the default format, which is the .docx format, and closes the document.

After it has processed all of the files, the program closes the Word application server and displays a message telling how many files it converted and skipped.
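If Word throws an exception partway through the loop, the original code never reaches the Quit call and may leave a hidden Word process running. A possible safeguard, sketched here as an assumption and not as part of the original program, is to wrap the work in try/finally. The WordSession class and WithWord method are hypothetical names.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class WordSession
{
    // Hypothetical helper: start the Word server, run the supplied work,
    // and quit Word even if the work throws an exception.
    public static void WithWord(Action<Word._Application> work)
    {
        Word._Application word_app = new Word.Application();
        object missing = System.Reflection.Missing.Value;
        object save_changes = false;
        try
        {
            work(word_app);
        }
        finally
        {
            // Close the Word application without saving changes.
            word_app.Quit(ref save_changes, ref missing, ref missing);
        }
    }
}
```

The conversion loop would go inside the delegate passed to WithWord.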

To be safe, this program does not delete the old .doc files, although you could easily add that feature. (I would probably put a CheckBox on the form so the user can indicate whether the program should delete the files.)
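A cautious version of that feature might delete the original only when the user has asked for it and the converted file actually exists. The sketch below assumes this policy. The OriginalCleanup class, the DeleteOriginalIfConverted method, and the chkDelete CheckBox mentioned afterward are hypothetical names.

```csharp
using System.IO;

static class OriginalCleanup
{
    // Hypothetical helper: delete the .doc file only if deletion was
    // requested and a .docx file with the same base name exists.
    // Returns true if the original file was deleted.
    public static bool DeleteOriginalIfConverted(string doc_path,
        bool delete_originals)
    {
        if (!delete_originals) return false;

        string docx_path = Path.ChangeExtension(doc_path, ".docx");
        if (!File.Exists(doc_path) || !File.Exists(docx_path)) return false;

        File.Delete(doc_path);
        return true;
    }
}
```

The program could call DeleteOriginalIfConverted(file_info.FullName, chkDelete.Checked) after closing the document, because Word must release the file before it can be deleted.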

   

 
