Compare the names of the files in two directories in C#
This example compares the names of the files in two directories. When you click the Compare button, the following code executes.
It starts by using Directory.GetFiles to get the names of the files in the first directory. It removes the directory name from the file names and uses Array.Sort to sort the names.
The code repeats these steps to get the names of the files in the second directory and sort them.
Next, the code loops through the arrays, comparing their entries. If two entries match, the code adds them both to the program's DataGridView control and advances both array indices. If they don't match, the program adds the one that comes first alphabetically to the DataGridView and increments only that array's index.
When the code finishes with all of the files in one of the arrays, it dumps the remaining items into the DataGridView.
// Compare the files in each directory.
private void btnCompare_Click(object sender, EventArgs e)
{
    // Get sorted lists of files in the directories.
    string dir1 = txtDir1.Text;
    if (!dir1.EndsWith("\\")) dir1 += "\\";
    string[] file_names1 = Directory.GetFiles(dir1);
    for (int i = 0; i < file_names1.Length; i++)
    {
        file_names1[i] = file_names1[i].Replace(dir1, "");
    }
    Array.Sort(file_names1);

    string dir2 = txtDir2.Text;
    if (!dir2.EndsWith("\\")) dir2 += "\\";
    string[] file_names2 = Directory.GetFiles(dir2);
    for (int i = 0; i < file_names2.Length; i++)
    {
        file_names2[i] = file_names2[i].Replace(dir2, "");
    }
    Array.Sort(file_names2);

    // Compare.
    int i1 = 0, i2 = 0;
    while ((i1 < file_names1.Length) && (i2 < file_names2.Length))
    {
        if (file_names1[i1] == file_names2[i2])
        {
            // They match. Display them both.
            dgvFiles.Rows.Add(new Object[] {file_names1[i1], file_names2[i2]});
            i1++;
            i2++;
        }
        else if (file_names1[i1].CompareTo(file_names2[i2]) < 0)
        {
            // Display the directory 1 file.
            dgvFiles.Rows.Add(new Object[] {file_names1[i1], null});
            i1++;
        }
        else
        {
            // Display the directory 2 file.
            dgvFiles.Rows.Add(new Object[] {null, file_names2[i2]});
            i2++;
        }
    }

    // Display remaining directory 1 files.
    for (int i = i1; i < file_names1.Length; i++)
    {
        dgvFiles.Rows.Add(new Object[] {file_names1[i], null});
    }

    // Display remaining directory 2 files.
    for (int i = i2; i < file_names2.Length; i++)
    {
        dgvFiles.Rows.Add(new Object[] {null, file_names2[i]});
    }
}
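
The merge step does not depend on the DataGridView, so it can be pulled out and tried on its own. The following is only a sketch, not part of the original example: the class name DirectoryCompareSketch and the helpers GetSortedFileNames and MergeSortedNames are hypothetical. It assumes ordinal (case-sensitive) ordering for both the sort and the comparison so the two always agree, and it uses Path.GetFileName instead of Replace to strip the directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirectoryCompareSketch
{
    // Hypothetical helper: return the names (without the directory)
    // of the files in dir, sorted with an ordinal comparison.
    public static string[] GetSortedFileNames(string dir)
    {
        string[] names = Directory.GetFiles(dir);
        for (int i = 0; i < names.Length; i++)
        {
            names[i] = Path.GetFileName(names[i]);
        }
        Array.Sort(names, StringComparer.Ordinal);
        return names;
    }

    // Hypothetical helper: merge two ordinally sorted arrays into rows.
    // Each row is { name in directory 1, name in directory 2 };
    // an entry is null when the file is missing from that directory.
    public static List<string[]> MergeSortedNames(string[] names1, string[] names2)
    {
        var rows = new List<string[]>();
        int i1 = 0, i2 = 0;
        while (i1 < names1.Length && i2 < names2.Length)
        {
            int cmp = string.CompareOrdinal(names1[i1], names2[i2]);
            if (cmp == 0)
            {
                // The names match. Record both and advance both indices.
                rows.Add(new string[] { names1[i1], names2[i2] });
                i1++;
                i2++;
            }
            else if (cmp < 0)
            {
                // The directory 1 name comes first.
                rows.Add(new string[] { names1[i1], null });
                i1++;
            }
            else
            {
                // The directory 2 name comes first.
                rows.Add(new string[] { null, names2[i2] });
                i2++;
            }
        }

        // Record whatever is left in either array.
        for (; i1 < names1.Length; i1++) rows.Add(new string[] { names1[i1], null });
        for (; i2 < names2.Length; i2++) rows.Add(new string[] { null, names2[i2] });
        return rows;
    }
}
```

In the click event handler, each returned row could then be displayed with something like dgvFiles.Rows.Add(row[0], row[1]). If names that differ only in case should count as a match, as is common on Windows, StringComparer.OrdinalIgnoreCase and string.Compare with StringComparison.OrdinalIgnoreCase would be the corresponding substitutions.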



Good article! But in my mind, using LINQ is much easier and much more readable.
// Files just in directory 1.
var fileJustInDir1 =
    from f1 in file_names1
    where !(from f2 in file_names2 select f2).Contains(f1)
    select f1;
foreach (var f in fileJustInDir1)
{
    dgvFiles.Rows.Add(new object[] { f, null });
}
I think as it is written this doesn't quite work. I suspect it's because the file names include the complete path. Since we're comparing two different directories, they will always have different paths. Directory.GetFiles returns complete paths, not just file titles (names without the path).
But I agree that something like this might be simpler to code in LINQ. You'd probably need to select both the file's title and path. Or perhaps title and a FileInfo object.
I also wonder how smart LINQ is about this sort of query. It would slow things down, for example, if it selected all files in directory 2 for each file in directory 1.
Perhaps if I get the time and can figure it out, I'll write a new example using LINQ and see how they compare on big directories. (If you write such an example, let me know and I'll post it.)
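
As a starting point for such an example, here is a minimal sketch, not a tested benchmark. The class name LinqCompareSketch and the helpers FileTitles and OnlyInFirst are hypothetical. It compares file titles rather than full paths, and it uses Enumerable.Except, which to my understanding builds a set from the second sequence once instead of rescanning it for every file in the first. Note that Except also removes duplicates, which is harmless here because file names within one directory are unique.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LinqCompareSketch
{
    // Hypothetical helper: the titles (names without the path)
    // of the files in a directory.
    public static IEnumerable<string> FileTitles(string dir)
    {
        return Directory.GetFiles(dir).Select(path => Path.GetFileName(path));
    }

    // Hypothetical helper: the names that appear in names1 but not
    // in names2, compared ordinally (case-sensitive) and sorted.
    public static List<string> OnlyInFirst(
        IEnumerable<string> names1, IEnumerable<string> names2)
    {
        return names1
            .Except(names2, StringComparer.Ordinal)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

The files found only in directory 1 would then be OnlyInFirst(FileTitles(dir1), FileTitles(dir2)), and swapping the arguments gives the files found only in directory 2.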