I am looking for large text files for testing compression and decompression, in all sizes from 1 KB up to megabytes. Can someone please point me to a link where I can download them?
You can download enwik8 and enwik9 from here. They are respectively 100,000,000 and 1,000,000,000 bytes of text for compression benchmarks. You can always pull subsets of those for smaller tests.

This command will generate a text file that contains lines of random text. On my Ubuntu 18 its size is about 10MB. Bumping up the number of lines, and thereby the size, is easy: just increase the head -n part. So, say, this command:
On my commodity hardware the latter command takes about 3 seconds to finish.

Looking for large text files for testing compression in all sizes. Asked 4 years, 5 months ago. Active 8 months ago. Viewed 46k times. Siranjeevi Rajendran.
Phillip Williams. Mark Adler.

Thanks, it is a nice command. But how can I know the head count for each size?

It would depend on the length of the line you're putting in.
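To illustrate how the line count depends on the line length, here is a rough Python sketch. It is not the command from the answer above. The function names, the 80-character line length, and the alphabet are all assumptions chosen for the example. With fixed-length lines, the count is simply the target size divided by the bytes per line, rounded up.

```python
import random
import string


def lines_needed(target_bytes, line_length=80):
    """Lines required to reach target_bytes when every line is
    line_length characters plus one newline byte (hypothetical helper)."""
    per_line = line_length + 1
    return -(-target_bytes // per_line)  # ceiling division


def write_random_text(path, target_bytes, line_length=80, seed=None):
    """Write fixed-length lines of random ASCII text until at least
    target_bytes have been written. Returns (line_count, bytes_written)."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits + " "
    written = 0
    lines = 0
    # newline="\n" stops Python from translating "\n" to "\r\n" on Windows,
    # so every line is exactly line_length + 1 bytes on any platform.
    with open(path, "w", encoding="ascii", newline="\n") as f:
        while written < target_bytes:
            line = "".join(rng.choices(alphabet, k=line_length)) + "\n"
            f.write(line)
            written += len(line)
            lines += 1
    return lines, written
```

For example, `write_random_text("sample.txt", 10 * 1024 * 1024)` would produce a file of slightly more than 10 MiB, overshooting by less than one line.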
Thanks for sharing. If you increase the bold number by 1, you can double the size of the file. If you decrease it by 1 you will reduce the file size by half. Would you please explain how this is calculated?
Would you please shed some light on this calculation?

Answers: 1. The script initially writes the text specified in the command to the file. Then, each time, the type command appends the file content in dummy. So the size of the file doubles with each iteration. If you decrease the number of characters in the initial file, the final file will automatically be smaller as well.
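The doubling arithmetic described here can be written out in a few lines of Python. This is only a sketch of the calculation, not the script itself. The 64-byte starting size and the iteration counts are illustrative assumptions, and the function names are hypothetical.

```python
def size_after_doublings(initial_bytes, doublings):
    """Final size when a file is appended to itself `doublings` times."""
    return initial_bytes * 2 ** doublings


def doublings_needed(initial_bytes, target_bytes):
    """Smallest number of doublings that reaches at least target_bytes.
    Returns (doublings, resulting_size)."""
    if initial_bytes <= 0:
        raise ValueError("initial_bytes must be positive")
    count = 0
    size = initial_bytes
    while size < target_bytes:
        size *= 2
        count += 1
    return count, size


# Assumed example values: a 64-byte starting file.
print(size_after_doublings(64, 14))   # 1048576, i.e. exactly 1 MiB
print(size_after_doublings(64, 15))   # 2097152, one more iteration doubles it
print(doublings_needed(64, 1024**3))  # (24, 1073741824), i.e. 1 GiB
```

Because each iteration multiplies the size by two, changing the loop count by one doubles or halves the result, and changing the length of the initial text scales every possible result by the same factor.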
However, the file size never grows beyond 4GB. I thought it might be a Windows limitation, so I used another tool to create the GB file. That tool successfully created the big GB file. Maybe this append-file method has a limitation at 4GB?

Its maximum file size is 4GB, so a file larger than 4 GB will not be created.
Is there a way to loop it to make multiple large files?

You can do that. First create a file with the above commands, then create the required number of files using the copy command.

For anyone putting the command into a batch file: you need to use double percent signs (%%) to make it work.

Hi. Great little article.

For example, if you are a system administrator and are deploying new file replication software, you may want to evaluate whether the software works for all scenarios.
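The "create one file, then copy it" approach suggested in the reply above can also be sketched in Python for anyone not working in a batch file. The function name and the `name_1`, `name_2`, ... naming scheme are assumptions made for this example.

```python
import shutil
from pathlib import Path


def make_copies(source, count, directory=None):
    """Copy `source` `count` times, naming the copies <stem>_1<suffix>,
    <stem>_2<suffix>, and so on. Returns the list of created paths."""
    source = Path(source)
    directory = Path(directory) if directory else source.parent
    directory.mkdir(parents=True, exist_ok=True)
    copies = []
    for i in range(1, count + 1):
        target = directory / f"{source.stem}_{i}{source.suffix}"
        shutil.copyfile(source, target)  # copies the contents only, no metadata
        copies.append(target)
    return copies
```

Calling `make_copies("dummy.txt", 5)` would leave five identical files next to the original, assuming a file with that name already exists.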
For this, you can create files of varying sizes and test the software before the actual deployment.

This will create a highly compressible file, since the same data is repeated over and over. Also, if you want to do performance testing, caching could skew your results if it's the same bits being loaded over and over.

Can someone please explain this step by step? I am not understanding those numbers. Can someone give me an example, please?
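On the compressibility point raised above, a quick way to see the difference is to compress repeated data and random data of the same length and compare the ratios. The sketch below uses Python's standard zlib module. The sample line is a made-up placeholder, not the text from the original command.

```python
import os
import zlib


def compressed_ratio(data):
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical seed line, repeated many times the way the doubling trick would.
seed_line = b"This is a placeholder line used to build a dummy test file.\r\n"
repeated = seed_line * 1024

# Random bytes of the same length, which are effectively incompressible.
random_bytes = os.urandom(len(repeated))

print(f"repeated data: {compressed_ratio(repeated):.4f}")
print(f"random data:   {compressed_ratio(random_bytes):.4f}")
```

The repeated data should shrink to a small fraction of its size, while the random data should stay at roughly its original size. For testing compression, replication with deduplication, or anything cache-sensitive, random or real-world content is therefore likely the safer choice.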
I could not follow the above example.

You can use dummy.

Hi, I have a requirement to create DOCX and PDF files of varying sizes with valid data in them. They can contain random text, but they need to open successfully. Are there any generating tools out there that can do the job quickly? If anyone can suggest a tool, it would be much appreciated.
Thanks in Advance. TTS: Have you found any solution for this? I needed a 1 GB file. Was this intentional? When doing things like upload test or download test, what matters is the actual size of the file, not how much space is allocated for the file on the disk. The operating systems and file system on the target machine might be completely different from the source machine. So what impact does this have? So this needs to be corrected for.
Other than that, very clever trick! The resulting file is only 62 bytes! The file should now be 64 bytes. So when you use the second command you will now get exactly 1 MB for the first example.
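The effect of a 62-byte versus a 64-byte starting file can be checked by doubling a file and reading back its actual size. The following Python sketch stands in for the echo/type commands, which it does not reproduce. The 14 iterations are an assumption chosen because 64 bytes doubled 14 times is exactly 1,048,576 bytes, and `double_file` is a hypothetical helper.

```python
import os
import tempfile


def double_file(path, times):
    """Append the file's current content to itself `times` times and
    return the resulting size in bytes as reported by the file system."""
    for _ in range(times):
        with open(path, "rb") as f:
            content = f.read()
        with open(path, "ab") as f:
            f.write(content)
    return os.path.getsize(path)


if __name__ == "__main__":
    for seed_size in (62, 64):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "dummy.txt")
            with open(path, "wb") as f:
                f.write(b"x" * seed_size)  # stand-in for the seed text
            final_size = double_file(path, 14)
            print(seed_size, "->", final_size)
```

Under these assumptions the 62-byte seed ends at 1,015,808 bytes, while the 64-byte seed lands on exactly 1,048,576 bytes, which is why two missing bytes in the first line matter for the final size.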
Thanks for pointing that out, I corrected the command. It looks like something went wrong when I added the command to the post; I usually verify that the results are accurate.
For example one place to save this without needing admin rights is the public pictures folder. Thanks for the script. Tried this on Windows 7, Windows Server and R2. The problem seems to be the script. Hi, Thanks a lot. This page is really informative.