DupeCompress
Written by Administrator   
Sunday, 06 December 2009 15:03

DupeCompress is an open-source application written in Visual Basic 2008 and targeting the .NET 2.0 platform.  It is a very simple program whose purpose is to analyze all of the files and folders in a directory and locate any duplicates.  Once the files are analyzed, one copy of each unique file is copied to a separate directory.  A stub exe, called REBUILD, is copied to that directory along with a database that records the original files and links each duplicate to its unique copy.  This directory can then be compressed and backed up or shared with others.


The motivation for creating this program was that I had several different folders which contained essentially the same files in different arrangements.  They came from an engineering test, and I had to transmit all of the different scenarios to a software engineer for analysis and troubleshooting.  The best compression I could get using WinRAR was about 75% of the original size: the original was almost 1GB and the compressed archive was about 750MB, too large to transfer over the internet comfortably.  Knowing that all of the files were essentially the same, I set out to create a program that would find the duplicates based on content, store the original filenames and folder structure, then remove the duplicates.  I could then RAR the files into an SFX archive, which would unpack the files to the TEMP folder on the recipient's computer and run the REBUILD program to rebuild the original files and folders.

I realize that this program may be of limited use to most people, but I decided to share it for those who wish to take the idea and expand it, or those who just wish to learn how something like this can be done.  It's not terribly complicated, I didn't spend much time on it, and it doesn't have proper error handling; I simply wanted to get this done as quickly as possible to accomplish my one task.  I considered adding the ability to add and remove individual files (the Class Module supports this, although the functionality to remove files from the database is not complete) and to compress the resulting unique files automatically.  I was going to use DotNetZip for this purpose, since it supports creating self-extracting archives that can run a program after decompression, but these features were going to take too long to implement.

In its current state, the main program accepts a source folder and a target folder.  All subfolders and files are analyzed for unique files, and duplicates are removed.  REBUILD.SOP and REBUILD.EXE files are added to the unique files in the target folder.
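
For readers who want to experiment with the idea, here is a minimal sketch of the deduplication step in Python. It is not the DupeCompress source, which is written in Visual Basic. The use of SHA-256 to compare content, the function names, and the manifest.json file are all assumptions made for illustration; the actual format of REBUILD.SOP is not described here.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe(source: Path, target: Path) -> dict:
    """Copy one file per distinct content from source into target.

    Returns the manifest, which maps each original relative path to the
    name of the stored file holding its content. The manifest is also
    written to target/manifest.json (a hypothetical stand-in for the
    database that DupeCompress writes).
    """
    target.mkdir(parents=True, exist_ok=True)
    manifest = {}
    stored = set()
    for path in sorted(p for p in source.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if digest not in stored:
            # First time this content is seen: keep a single copy,
            # named after its digest so names cannot clash.
            shutil.copyfile(path, target / digest)
            stored.add(digest)
        manifest[path.relative_to(source).as_posix()] = digest
    (target / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling dedupe(Path("scenarios"), Path("unique")) with folders of your own would leave one stored file per distinct content in the target, plus the manifest. This sketch does not record empty folders, and it assumes the target folder is not inside the source folder.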

 

DupeCompress Main Window


When Rebuild.exe is run, it prompts the user for a target folder in which to recreate the original files.  A progress bar shows the status of the rebuild.  Even though the Rebuild program is the simpler of the two, more attention was paid to its interface and error handling for obvious reasons.  A command line parameter "/debug" can be passed to Rebuild.exe, which will then generate a file called Rebuild.log.txt in the user's My Documents folder.
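
The rebuild step can be sketched in a similar way. Again, this is a hedged Python illustration and not the code in Rebuild.exe: it assumes a hypothetical manifest.json that maps each original relative path to the name of a stored unique file, and the on_progress callback is an invented stand-in for the progress bar.

```python
import json
import shutil
from pathlib import Path


def rebuild(store: Path, destination: Path, on_progress=None) -> int:
    """Recreate every original file listed in store/manifest.json.

    The manifest (an assumed format) maps original relative paths to
    the names of the unique files kept in the store folder. Returns
    the number of files written.
    """
    manifest = json.loads((store / "manifest.json").read_text())
    total = len(manifest)
    for done, (relative, stored_name) in enumerate(manifest.items(), start=1):
        out = destination / relative
        # Recreate the original folder structure as needed.
        out.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(store / stored_name, out)
        if on_progress is not None:
            on_progress(done, total)  # e.g. update a progress bar
    return total
```

Because duplicates are restored by copying the same stored file to several destinations, the rebuilt folder returns to its original size even though the transferred folder was much smaller.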

 

Client interface to rebuild the original data


The software is Copyright Integral Data Systems, LLC.  You are free to use it however you wish, provided that you include a reference back to this website if you use any portion of it in your program.

 
Download the compiled EXEs here:  http://www.idatasys.net/pubfiles/dupecompress_1_0.zip

Download the source code here: http://www.idatasys.net/pubfiles/dupecompress_src_1_0.zip

Last Updated on Sunday, 06 December 2009 15:47
 
Copyright © 2010 IDS. All Rights Reserved.