Swudu Susuwu Posted March 11, 2024

To undo threatware/spyware/ransomware/malware for us, would you add artificial neural networks (artificial CNS) to virus scanners? GitHub has lots of FLOSS (free/libre and open-source software) simulators of CNS (at https://github.com/topics/artificial-neural-network , such as https://github.com/Rober-t/apxr_run/ ), which virus scanners could use to do this:

Just have training data inputs = samples of infected files/programs, and outputs = original files/programs (or end-of-file for samples that are just an infection and have no original files/programs to revert to), to produce artificial CNS to undo infections from files/programs.

Assume most antivirus programs have heuristical analysis and sandboxes, but if not, here is how to do this:

Search for open source (FLOSS) virus scanners ( https://github.com/topics/virus-scanner has lots, such as https://github.com/kylefarris/clamscan ) and look at how those scan executables to figure out what programs do with your OS.

Most look for OS opcodes (such as “int” or “syscall”), or look at what libraries the programs load and search for instructions such as “jmp” or “call” that go to system libraries, to flag programs that alter other programs and to flag programs that alter page flags to have W+X (lots of malware alters pages to have both writable and executable access, so virus scanners block such programs).

To figure out what libraries a program loads, refer to the specification of the OS’s executable format: “Portable Executable” for Windows ( https://learn.microsoft.com/en-us/windows/win32/debug/pe-format https://wikipedia.org/wiki/Portable_Executable ), “Executable and Linkable Format” (ELF) for most others such as UNIX and Linux ( https://wikipedia.org/wiki/Executable_and_Linkable_Format ). Those specifications tell you what libraries a program loads at startup, plus those libraries’ functions’ addresses.
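As a rough illustration of the W+X check, here is a minimal Python sketch that reads the program headers of a 64-bit little-endian ELF file and lists the segments marked both writable and executable. The function name `wx_segments` is made up for this example. It only covers the static case (flags stored in the file); a program that changes page flags at run time would need a sandbox or tracer to catch.

```python
import struct

# Layouts from the ELF64 specification (little-endian only in this sketch).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # file header, 64 bytes
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")      # one program header, 56 bytes
PF_X, PF_W = 0x1, 0x2                            # segment permission bits


def wx_segments(data: bytes) -> list[int]:
    """Return indexes of program headers that are both writable and executable.

    Hypothetical helper for illustration; not taken from any real scanner.
    """
    if len(data) < ELF_HEADER.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")

    fields = ELF_HEADER.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    flagged = []
    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        if offset + PROGRAM_HEADER.size > len(data):
            break  # truncated file: stop rather than read past the end
        _p_type, p_flags, *_rest = PROGRAM_HEADER.unpack_from(data, offset)
        if p_flags & PF_W and p_flags & PF_X:
            flagged.append(index)
    return flagged
```

Usage would be something like `wx_segments(open(path, "rb").read())`; a non-empty result is a reason for a closer look, not proof of an infection.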
Virus scanners should also look at dynamic loads of functions ( https://www.codeproject.com/Questions/338807/How-to-get-list-of-all-imported-functions-invoked ), such as through GetProcAddress, or just flag functions such as GetProcAddress.

For better heuristical analysis, virus scanners should flag programs that perform raw network accesses (versus OS functions to download/upload files), or that alter files of which the program is not the owner.

Some of this requires that you not just look at what functions the program calls, but also look at the parameters the program passes to those functions (if those are constants) or guess them (if those are registers/addresses; antiviruses should use sandboxes for this, or just flag all non-constant parameters to sensitive functions).

Example outputs from F-Droid through sandbox/analysis:
https://www.virustotal.com/gui/file/dc3bb88f6419ee7dde7d1547a41569aa03282fe00e0dc43ce035efd7c9d27d75
https://www.virustotal.com/ui/file_behaviours/dc3bb88f6419ee7dde7d1547a41569aa03282fe00e0dc43ce035efd7c9d27d75_VirusTotal R2DBox/html
https://www.virustotal.com/ui/file_behaviours/dc3bb88f6419ee7dde7d1547a41569aa03282fe00e0dc43ce035efd7c9d27d75_Zenbox/html

The false positive outputs (from Zenbox) show the purpose of manual reviews for programs that your sandbox flags.
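The crudest form of "just flag functions such as GetProcAddress" can be sketched in a few lines of Python: search the raw bytes of a file for the names of sensitive functions. The name list and the helper `flag_sensitive_names` are assumptions chosen for this example (all four are real Win32 function names, but which ones count as sensitive is a judgment call). A real scanner would parse the PE import table instead, since a raw byte search also matches names that sit in ordinary strings.

```python
# Illustrative list only; which functions count as "sensitive" is an assumption.
SENSITIVE_NAMES = (
    b"GetProcAddress",
    b"LoadLibraryA",
    b"VirtualProtect",
    b"WriteProcessMemory",
)


def flag_sensitive_names(blob: bytes, names=SENSITIVE_NAMES) -> list[str]:
    """Return the sensitive function names that occur anywhere in the file bytes.

    Hypothetical helper: a plain substring search, not a real import-table parse.
    """
    return [name.decode("ascii") for name in names if name in blob]
```

A hit only says the name is present somewhere in the file, so the result is a list of reasons to send the program to a sandbox or a manual review.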
swansont Posted March 11, 2024

! Moderator Note

Your proposals don’t seem to contain the level of detail one needs to have a discussion. It is not enough to speculate, e.g. “Assume most antivirus programs have heuristical analysis and sandboxes”. If you know these details and can discuss them, then do so. If not, there’s no discussion to be had. This isn’t a place to present vague, hand-wavy propositions.
Swudu Susuwu Author Posted March 11, 2024 (edited)

4 hours ago, swansont said: ! Moderator Note Your proposals don’t seem to contain the level of detail one needs to have a discussion. It is not enough to speculate e.g. “Assume most antivirus programs have heuristical analysis and sandboxes”

For the scanners with heuristical analysis and sandboxes, the next logical move is to add artificial CNS. Not all scanners have such analysis and sandboxes, but it is clear that some, such as VirusTotal, do, as the URLs show us. The earlier URLs show VirusTotal's heuristical analysis of F-Droid's package manager and its review of the program's behaviour through two sandboxes.

A Unix-like OS such as Linux has "chroot()" (run `man chroot` for instructions) so that the programs you test cannot alter stuff out of the test, and has the "strace" command (run `man strace` for instructions, or look at https://opensource.com/article/19/10/strace https://www.geeksforgeeks.org/strace-command-in-linux-with-examples/ ) which traces all system calls and saves logs. Simple sandboxes just launch programs for a few seconds and dump such logs, with additional heuristics to guess which calls go to the logs so reviewers have to look through less. Autonomous sandboxes use the ideas from the first post to flag programs that do system calls that would alter resources that are not part of the program under test.

Heuristical analysis is similar to Clang/LLVM's static analysis tools (static analysis checks programs for accidental security threats such as buffer overruns/underruns), and you could use the FLOSS static analysis tools as a first basis for virus scanners: just add checks for deliberate security threats, flag those for manual reviews, and warn not to run such programs before the reviews. https://github.com/llvm/llvm-project/blob/main/clang/lib/StaticAnalyzer is a FLOSS basis for analysis and has uses for virus scanners.

As for the artificial neurons/CNS, those are as simple to use for this as the original post says.
If you want, it would not require too much effort to do this, but who has access to large sample databases for the artificial CNS?

Edited March 11, 2024 by Swudu Susuwu: Remove extra spaces, add resources
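To make the strace-based sandbox idea from this post a bit more concrete, here is a minimal Python sketch that reads lines in strace's usual `name(args) = result` format and flags calls that try to modify paths outside the test directory. The sandbox root `/sandbox`, the helper name `flag_outside_writes`, and the choice of which syscalls count as "modifying" are all assumptions for this example. It only looks at absolute paths and ignores relative ones.

```python
import re

# Matches lines such as: openat(AT_FDCWD, "/etc/passwd", O_WRONLY) = 3
CALL = re.compile(r'^(?:\[pid\s+\d+\]\s+)?(\w+)\((.*)\)\s+=\s')
ABSOLUTE_PATH = re.compile(r'"(/[^"]*)"')

# Assumed (incomplete) set of syscalls that always modify the file system.
MODIFYING = {"unlink", "unlinkat", "rename", "renameat", "chmod",
             "chown", "truncate", "mkdir", "rmdir"}
WRITE_FLAGS = ("O_WRONLY", "O_RDWR", "O_CREAT", "O_TRUNC", "O_APPEND")


def flag_outside_writes(log_lines, sandbox_root="/sandbox"):
    """Return (syscall, path) pairs for modifying calls outside sandbox_root.

    Hypothetical helper: attempts are flagged whether or not the call succeeded.
    """
    root = sandbox_root.rstrip("/") + "/"
    flagged = []
    for line in log_lines:
        match = CALL.match(line.strip())
        if not match:
            continue  # signals, exit lines, unfinished calls, etc.
        name, args = match.group(1), match.group(2)
        opens_for_write = name in ("open", "openat") and any(
            flag in args for flag in WRITE_FLAGS)
        if name not in MODIFYING and not opens_for_write:
            continue
        for path in ABSOLUTE_PATH.findall(args):
            if not path.startswith(root):
                flagged.append((name, path))
    return flagged
```

A log for this could come from something like `strace -f -o trace.log ./program`, then `flag_outside_writes(open("trace.log"))`. Everything the function returns would still go to a manual review.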
Swudu Susuwu Author Posted March 11, 2024

https://github.com/llvm/llvm-project/blob/main/clang/lib/StaticAnalyzer is part of LLVM and is a FLOSS basis for analysis, but if you want just the analysis without the rest of Clang, https://github.com/secure-software-engineering/phasar is a standalone framework (note that it still works on LLVM IR).
swansont Posted March 11, 2024

2 hours ago, Swudu Susuwu said: For the scanners with heuristical analysis and sandboxes, the next logical move is to add artificial CNS

! Moderator Note

You don’t appear to be discussing how to do this. It would be like saying physics problems should be solved by applying the appropriate equations and doing the math. A generally true statement that does nothing to foster discussion. The devil’s in the details. You need to be providing those details.
Swudu Susuwu Author Posted March 12, 2024 (edited)

On 3/11/2024 at 2:36 PM, swansont said: ! Moderator Note You don’t appear to be discussing how to do this. It would be like saying physics problems should be solved by applying the appropriate equations and doing the math. A generally true statement that does nothing to foster discussion. The devil’s in the details. You need to be providing those details.

Quote: Just have training data inputs = samples of infected files/programs, and outputs = original files/programs (or end-of-file for samples that are just an infection and have no original files/programs to revert to), to produce artificial CNS to undo infections from files/programs.

Quote: If you want, it would not require too much effort to do this, but who has access to large sample databases for the artificial CNS

All that is lacking are the structures of the sample databases. If you post URLs to such databases, I could produce this in under 2 hours. This is not particular to ANN/CNS virus scanners but, rather, applies to all ANN/CNS problems; their only limitations are CPU resources to crunch the databases and access to huge databases.

Examples of how to set up APXR as artificial CNS: https://github.com/Rober-t/apxr_run/blob/master/src/examples/
Examples of how to set up HSOM as artificial CNS: https://github.com/CarsonScott/HSOM/tree/master/examples

Simple to set up once you have access to databases.

Edited March 12, 2024 by Swudu Susuwu: Add examples
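For what "inputs = infected samples, outputs = original files (or end-of-file)" could look like as data, here is a minimal Python sketch. It trains nothing and uses no neural network library; it only shows one possible way to lay out the pairs. The names `TrainingPair`, `encode`, `to_example`, the `EOF` token value, and the fixed-length truncation are all assumptions made for this example. Real executables are far longer than any fixed window used here.

```python
from dataclasses import dataclass

EOF = 256  # hypothetical end-of-file token, one past the largest byte value


@dataclass(frozen=True)
class TrainingPair:
    """One sample: the infected bytes and the clean original.

    An empty original stands for a sample that is pure infection,
    so the target is just end-of-file.
    """
    infected: bytes
    original: bytes = b""


def encode(data: bytes, length: int) -> list[int]:
    """Turn bytes into a fixed-length token list, ending and padding with EOF."""
    tokens = list(data[:length - 1]) + [EOF]
    return tokens + [EOF] * (length - len(tokens))


def to_example(pair: TrainingPair, length: int) -> tuple[list[int], list[int]]:
    """Return the (input, target) token lists for one training pair."""
    return encode(pair.infected, length), encode(pair.original, length)
```

For instance, `to_example(TrainingPair(b"\x90\x01\x02", b"\x01\x02"), 5)` gives the input `[144, 1, 2, 256, 256]` and the target `[1, 2, 256, 256, 256]`. Whether a network can actually learn such a mapping for whole programs is the open question of this thread; the sketch says nothing about that.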
Swudu Susuwu Author Posted March 19, 2024 (edited)

Oops, Clamscan is just a wrapper; for examples of FLOSS virus scanners, go to https://github.com/Cisco-Talos/clamav/

With regards to new CNS virus scanners: Just as one of us (if humans grew trillions of neurons plus thousands of layers of cortices) could pore through all databases of infections (plus samples of fresh programs) to set up our synapses to revert (from hex dumps) all infections to fresh programs, so too could artificial CNS with trillions of artificial neurons do this.

Edited March 19, 2024 by Swudu Susuwu
swansont Posted March 19, 2024 ! Moderator Note You need to go start a blog somewhere. All you’re doing here is soapboxing, which is against the rules.