fredreload Posted October 4, 2016 Posted October 4, 2016 And here it is, the long waited algorithm thanks to Staphen that provided the algorithm in Crunchyroll. You are free to test it out on bigger files, currently it only runs for Blank.gif without going too slow. You are free to criticize only if you found a way to make this faster. Starting with iteratedbitcount. Thanks in advance! using System;using System.IO;using System.Collections.Generic;using System.Linq;using System.Numerics;using System.Threading.Tasks;namespace HelloWorld{ class Hello { public static byte[] bytes = System.IO.File.ReadAllBytes(@"E:\data\Blank.gif"); public static byte[] bytes_new; static void Main() { BigInteger number_compress = new BigInteger(bytes); Console.WriteLine(number_compress); Console.WriteLine(); int[] one = new int[50]; int[] zero = new int[50]; int[] position = new int[50]; for (int i = 0; i < 50; i++) { //Console.WriteLine(i); one = IteratedBitcount(number_compress); zero = IteratedZerocount(number_compress); position = one + zero; BigInteger fac = Factorial(one); BigInteger n = FindN(number_compress, one, position); number_compress = n; } Console.WriteLine(number_compress); Console.WriteLine(); BigInteger number_decompress = 0; for (int i = 50-1; i >= 0; i--) { //Console.WriteLine(i); number_decompress = FindNumber(number_compress, one, position); number_compress = number_decompress; } Console.WriteLine(number_decompress); Console.WriteLine(); bytes_new = number_decompress.ToByteArray(); File.WriteAllBytes(@"E:\data\New_Blank.gif", bytes_new); } static BigInteger FindNumber(BigInteger n, int one, int position) { BigInteger number=0; int k = one; for (int j = position-1; j >=0 ; j--) { BigInteger mult = j; BigInteger num = 0; for (BigInteger i = 1; i < (BigInteger)k; i++) { mult *= (j - i); } num = (mult / Factorial(k)); if (num <= n) { number = number + BigInteger.Pow(2, j); k--; n = n - num; } if (k == 0) break; } return number; } static BigInteger FindN(BigInteger number, int one, int position) { BigInteger findMSB = number; BigInteger test = number; var numBits = (int)Math.Ceiling(BigInteger.Log(2)); BigInteger MSB=1; while (findMSB > 0) { MSB <<= 1; findMSB >>= 1; } MSB >>= 1; BigInteger n = 0; int k = one; int pos = position; while (MSB != 0) { if ((test & MSB) > 0) { int a = pos-1; BigInteger mult = a; for (BigInteger i = 1; i < (BigInteger)k; i++) { mult *= (a - i); } n += (mult/Factorial(k)); k--; } MSB >>= 1; pos--; } return n; } static int IteratedBitcount(BigInteger n) { BigInteger test = n; int count = 0; while (test != 0) { if ((test & 1) == 1) { count++; } test >>= 1; } return count; } static int IteratedZerocount(BigInteger n) { BigInteger test = n; int count = 0; while (test != 0) { if ((test & 1)==0) { count++; } test >>= 1; } return count; } static BigInteger Factorial(int arg) { BigInteger value = 1; for (int index = 2; index <= arg; index++) { //Console.WriteLine(index); value *= index; } return value; } }}
Strange Posted October 4, 2016 Posted October 4, 2016 You are free to test it out on bigger files, currently it only runs for Blank.gif without going too slow. What language is this written in? How large is "blank.gif"? Have you checked that the output of compressing and then decompressing the file is identical to the input? Have you tested it with other files of a similar size (e.g. a sequence of random bytes)? You are free to criticize only if you found a way to make this faster. Not if we find it doesn't work? 1
Sensei Posted October 4, 2016 Posted October 4, 2016 (edited) Again, and again, the same.. You have no idea whether it truly works, as you have no decompression function... Writing off data to file, disallows you checking whether every single byte is equal to original (after compressing, then decompressing data). Result from this line of code is nowhere used, as fac is nowhere else referenced: BigInteger fac = Factorial(one); What language is this written in? C# .NET Framework Edited October 4, 2016 by Sensei
fredreload Posted October 5, 2016 Author Posted October 5, 2016 What language is this written in? How large is "blank.gif"? Have you checked that the output of compressing and then decompressing the file is identical to the input? Have you tested it with other files of a similar size (e.g. a sequence of random bytes)? Not if we find it doesn't work? Yes the output is identical to the input Strange . Gonna try optimizing this weekend, let me know if you can help lol Again, and again, the same.. You have no idea whether it truly works, as you have no decompression function... Writing off data to file, disallows you checking whether every single byte is equal to original (after compressing, then decompressing data). Result from this line of code is nowhere used, as fac is nowhere else referenced: BigInteger fac = Factorial(one); C# .NET Framework Cool, I'll take it out, thanks!
fredreload Posted October 5, 2016 Author Posted October 5, 2016 This step is taking the most amount of my time. My best bet is a multiplication table stored in database, anyone else got any idea? The for loop here is taking too long, no for loop for (int i = 1; i < k; i++) { mult *= (a - (BigInteger)i); }
timo Posted October 5, 2016 Posted October 5, 2016 As far as I can see you are calculating mult = a*(a-1)*...*(a-k+1) in that loop. Special cases like k>a+1 and k<2 aside (consider if they can appear in your code), this is the same as mult=a!/k!. You already have a factorial function in your code: Use it, then optimize it with a lookup-table if needed. Note: You are dividing by another k! later on. You only need to calculate it once, of course. 1
Strange Posted October 6, 2016 Posted October 6, 2016 I would still like to see some evidence that the algorithm works, and how well it works. What is the size of the inout file? What is the size of the compressed file? How does that change for (a) random data; (b) ordered data; (c ) all zeroes; etc. Is the result of uncompressing the data identical in all cases?
fredreload Posted October 9, 2016 Author Posted October 9, 2016 Let's see, I got a question first, I converted bytes to BigInteger like this BigInteger number_compress = new BigInteger(bytes.Concat(new byte[] { 0 }).ToArray()); Now the byte array is [378] byte array, as you know byte array is 8 bits each, so that's a total of 3024 bits. The thing is after conversion to number_compress, this BigInteger has 911 base 10 digits, so I'm not sure if BigInteger actually made the value bigger since I can't convert BigInteger to bits in c#. Below is the number. 10511498515079005786001923156952989929760765099887642049490645075407147670074787123453081119964496075798572683908765969937737368112628851278719151548358246532408860460103033076417881911156352613863717173123230737730516444716700786319847260734876596336573683014182274170198577808390323212579797450822781696155303651128091350419064337358263354652969443506081373456328170981909504950042936463577106765595222710454155651057937886003720893239530662179029448821080213785811445808269660735189910921681751787499664128734166678222596180132877088540155530232472178185615057339811168768357276447018187314658464330483545764600882883706505887000156889348127462631364561412835589414227905626001556124920721381180916353004195357239703395912173839712380569892467769276028873585749529029704900277393477308046170568306899216653561856937648186383356499772886802103617644079045242679876437789706198682378675257965584060895470702729
Strange Posted October 9, 2016 Posted October 9, 2016 You can use the BigInterger.ToString method to convert it to hex. Extracting binary from hex is trivial.
fredreload Posted October 9, 2016 Author Posted October 9, 2016 (edited) I used a StringBuilder to append the bits, the compression rate is much better. But I still need to deal with the time = = As far as I can see you are calculating mult = a*(a-1)*...*(a-k+1) in that loop. Special cases like k>a+1 and k<2 aside (consider if they can appear in your code), this is the same as mult=a!/k!. You already have a factorial function in your code: Use it, then optimize it with a lookup-table if needed. Note: You are dividing by another k! later on. You only need to calculate it once, of course. Hmm, the thing is it is (a - i), and it could be something like mult = 15*14*13/5!, I'm interested in using a look up table, possibly a database? But I am a newbie at that, how do you think I should start? Other than the look up table, is there other possible methods? Btw I'm applying the recursive "a" computation I asked here. As an example of the multiplication BigInteger mult = 274341; for (int j = 1; j < 157594; j++){ mult *= (BigInteger)(274341 - j);} P.S. Right I see it, mult=a!/k!, looks like I need a look up table Edited October 9, 2016 by fredreload
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now