Code:
D:\_KAZE_new-stuff\Dummy_Check_package_r2>dir/s
Volume in drive D is H320_Vol5
Volume Serial Number is 0CB3-C881
Directory of D:\_KAZE_new-stuff\Dummy_Check_package_r2
03/03/2011 12:03 AM <DIR> .
03/03/2011 12:03 AM <DIR> ..
03/03/2011 12:03 AM 259 Dummy_Check.bat
03/03/2011 12:03 AM 4,024,155 english.dic_351116_wordlist
03/03/2011 12:03 AM 94,208 Leprechaun_r13++++++_Microsoft_16.00.30319.01.exe
03/03/2011 12:03 AM 66,048 Overlapper-Blender_r1+.exe
03/02/2011 11:48 PM <DIR> TREE_of_TXT_files_to_be_processed
03/03/2011 12:03 AM 34,606 Yoshi_r6.exe
5 File(s) 4,219,276 bytes
Directory of D:\_KAZE_new-stuff\Dummy_Check_package_r2\TREE_of_TXT_files_to_be_processed
03/02/2011 11:48 PM <DIR> .
03/02/2011 11:48 PM <DIR> ..
03/03/2011 12:03 AM 15,816 oed2_hist.txt
03/03/2011 12:03 AM 17,128 oed2_hist10.txt
03/03/2011 12:03 AM 8,475 oed2_hist11.txt
03/03/2011 12:03 AM 13,394 oed2_hist12.txt
03/03/2011 12:03 AM 11,942 oed2_hist13.txt
03/03/2011 12:03 AM 12,366 oed2_hist2.txt
03/03/2011 12:03 AM 11,197 oed2_hist3.txt
03/03/2011 12:03 AM 9,752 oed2_hist4.txt
03/03/2011 12:03 AM 12,589 oed2_hist5.txt
03/03/2011 12:03 AM 11,206 oed2_hist6.txt
03/03/2011 12:03 AM 15,374 oed2_hist7.txt
03/03/2011 12:03 AM 15,962 oed2_hist8.txt
03/03/2011 12:03 AM 12,009 oed2_hist9.txt
13 File(s) 167,210 bytes
Total Files Listed:
18 File(s) 4,386,486 bytes
5 Dir(s) 1,004,392,448 bytes free
D:\_KAZE_new-stuff\Dummy_Check_package_r2>type Dummy_Check.bat
cd TREE_of_TXT_files_to_be_processed
..\Yoshi_r6.exe -f -o..\Dummy_Check.lst *.txt
cd..
Leprechaun_r13++++++_Microsoft_16.00.30319.01.exe Dummy_Check.lst Dummy_Check.lst.wrd 3000
Overlapper-Blender_r1+.exe Dummy_Check.lst.wrd english.dic_351116_wordlist
D:\_KAZE_new-stuff\Dummy_Check_package_r2>Dummy_Check.bat
D:\_KAZE_new-stuff\Dummy_Check_package_r2>cd TREE_of_TXT_files_to_be_processed
D:\_KAZE_new-stuff\Dummy_Check_package_r2\TREE_of_TXT_files_to_be_processed>..\Yoshi_r6.exe -f -o..\Dummy_Check.lst *.txt
Yoshi(Filelist Creator), revision 06, written by Svalqyatchx,
in fact based on SWEEP.C from 'Open Watcom Project', thanks-thanks.
Note1: So far, it works for current directory only.
Note2: Default method is depth-first traversal;
may use pipe 'Yoshi|sort' for breadth-first_like traversal results.
Note3: Make notice that '*.*'(extensionfull only) is not equal to '*'(all);
one disadvantage is an inability to list only extensionless filenames.
Note4: Search is case-insensitive as-must.
Note5: This revision allows multiple '*', and meaning of masks is:
'?' - any character AND NOT EMPTY(default, for OR EMPTY see option -e);
'*' - any character(s) or empty.
Note6: What is a .LBL(LineByLine) file?
it is a bunch of GRAMMATICAL lines not mere LF or CRLF lines;
it contains not symbols under 32(except CR and LF) and above 127;
it contains not space symbol sequences.
Usage:
Yoshi [option(s)] [filename(s)]
option(s):
-v i.e. verbose mode; output goes to console;
-f i.e. fullpath mode for output;
-e i.e. treat '?' as any character OR EMPTY;
-t i.e. touch all encountered files;
-2 i.e. convert all encountered .TXT files to .LBL files;
-o<filename> i.e. output goes to file(in append mode).
filename(s):
Wildcards '*' and wildcards '?' are allowed i.e. "str*.c??";
default filename is '*'; DO NOT FORGET TO PUT
filename(s) WITH WILDCARD(S) INTO QUOTE MARKS!
Examples:
Yoshi -v -f -oCaterpillar_NON.lst "*.lbl" "*.txt" "*.htm" "*.html"
Yoshi -f -oMyEbooks.txt "*wiley*essential*.pdf" "*russian*.*htm"
Yoshi: Total size of files: 00,000,000,167,210 bytes.
Yoshi: Total files: 000,000,000,013.
Yoshi: Total folders: 0,000,000,000.
D:\_KAZE_new-stuff\Dummy_Check_package_r2\TREE_of_TXT_files_to_be_processed>cd..
D:\_KAZE_new-stuff\Dummy_Check_package_r2>Leprechaun_r13++++++_Microsoft_16.00.30319.01.exe Dummy_Check.lst Dummy_Check.lst.wrd 3000
Leprechaun(Fast Greedy Word-Ripper), revision 13++++++, written by Svalqyatchx.
Leprechaun: 'Oh, well, didn't you hear? Bigger is good, but jumbo is dear.'
Kaze: Let's see what a 3-way hash + 6,602,752 Binary-Search-Trees can give us,
also the performance of a 3-way hash + 6,602,752 B-Trees of order 3.
Size of input file with files for Leprechauning: 1550
Allocating memory 1170MB ... OK
Size of Input TEXTual file: 15,816
|; Word count: 2,572 of them 790 distinct; Done: 64/64
Size of Input TEXTual file: 17,128
/; Word count: 5,290 of them 1,359 distinct; Done: 64/64
Size of Input TEXTual file: 8,475
-; Word count: 6,618 of them 1,570 distinct; Done: 64/64
Size of Input TEXTual file: 13,394
\; Word count: 8,930 of them 2,014 distinct; Done: 64/64
Size of Input TEXTual file: 11,942
|; Word count: 11,035 of them 2,493 distinct; Done: 64/64
Size of Input TEXTual file: 12,366
/; Word count: 13,117 of them 2,714 distinct; Done: 64/64
Size of Input TEXTual file: 11,197
-; Word count: 14,968 of them 2,914 distinct; Done: 64/64
Size of Input TEXTual file: 9,752
\; Word count: 16,604 of them 3,078 distinct; Done: 64/64
Size of Input TEXTual file: 12,589
|; Word count: 18,726 of them 3,237 distinct; Done: 64/64
Size of Input TEXTual file: 11,206
/; Word count: 20,545 of them 3,388 distinct; Done: 64/64
Size of Input TEXTual file: 15,374
-; Word count: 22,972 of them 3,601 distinct; Done: 64/64
Size of Input TEXTual file: 15,962
\; Word count: 25,447 of them 3,815 distinct; Done: 64/64
Size of Input TEXTual file: 12,009
|; Word count: 27,328 of them 3,974 distinct; Done: 64/64
Bytes per second performance: 167,210B/s
Words per second performance: 27,328W/s
Flushing unsorted words ...
Time for making unsorted wordlist: 1 second(s)
Deallocated memory in MB: 1170
Allocated memory for words in MB: 1
Allocated memory for pointers-to-words in MB: 1
Sorting(with 'MultiKeyQuickSortX26Sort' by J. Bentley and R. Sedgewick) ...
Sort pass 26/26 ...
Flushing sorted words ...
Time for sorting unsorted wordlist: 1 second(s)
Leprechaun: Done.
D:\_KAZE_new-stuff\Dummy_Check_package_r2>Overlapper-Blender_r1+.exe Dummy_Check.lst.wrd english.dic_351116_wordlist
Overlapper-Blender r.1+, written by Kaze.
Size of 1st input file: 36609
Size of 2nd input file: 4024155
Allocating 1024MB ...
Lines in 1st input file: 3974
Lines in 2nd input file: 351116
Allocated memory for pointers-to-words in MB: 2
Allocated memory for pointers-to-words in MB: 1
Sorting 355090 Pointers ...
Deduplicating duplicates and dumping all into 'Blended.txt' ...
Dumping deduplicated duplicates into 'Overlapped.txt' ...
Dumping all-from-first-file except deduplicated duplicates into 'Unfamiliar.txt' ...
Blended lines, i.e. combined lines from both files: 351623
Overlapped lines, i.e. lines common for both files: 3467
Unfamiliar lines, i.e. lines from 1st file not encountered in 2nd file: 507
D:\_KAZE_new-stuff\Dummy_Check_package_r2>type Unfamiliar.txt
abrm
ada
addenbrooke
addlestone
...
wyllie
wyndham
yockney
yonge
yvonne
zorc
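
The three counts Overlapper-Blender reports above are consistent with plain set operations on the two wordlists: 'Blended.txt' as the union, 'Overlapped.txt' as the intersection, and 'Unfamiliar.txt' as the first file minus the second. The sketch below is only an illustration of that reading. It is inferred from the log messages and the numbers, not from the tool's source, and the function name `overlap_blend` is hypothetical. It also assumes neither input file contains internal duplicates.

```python
def overlap_blend(first_words, second_words):
    """Return (blended, overlapped, unfamiliar) as sorted lists.

    Assumed semantics, inferred from the log only:
      blended    = every distinct line from both inputs
      overlapped = lines present in both inputs
      unfamiliar = lines of the first input missing from the second
    """
    first = set(first_words)
    second = set(second_words)
    blended = sorted(first | second)
    overlapped = sorted(first & second)
    unfamiliar = sorted(first - second)
    return blended, overlapped, unfamiliar


# Cross-check the figures printed in the transcript.
lines_first, lines_second, overlapped_count = 3974, 351116, 3467
assert lines_first + lines_second == 355090                     # pointers sorted
assert lines_first + lines_second - overlapped_count == 351623  # blended lines
assert lines_first - overlapped_count == 507                    # unfamiliar lines

# Toy data for illustration only, not taken from the real wordlists.
blended, overlapped, unfamiliar = overlap_blend(
    ["ada", "word", "zorc"], ["word", "list"])
print(unfamiliar)
```

The real tool sorts one combined array of 355,090 pointers and deduplicates in a single pass, which avoids building hash sets; the set-based version trades that memory layout for brevity.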

Actually, I could not find any mistakes (in those 507 words from 'Unfamiliar.txt'), but I did find something more ominous than a typo: no (formal) RECOGNITION whatsoever of Samuel Johnson's contribution, caramba! If the