GitHub - srijan-Git1247/ML.Net-Application-FileClassifier: Classification of a file as malicious or benign using a logistic regression model trained with the Stochastic Dual Coordinate Ascent (SDCA) method.

The FeatureExtractor class extracts features from every file in a given folder. Once the extraction is complete, the classification and strings data are written out to the sampledata.csv file.
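As a rough illustration of what "strings data" can mean for a binary file, the sketch below pulls runs of printable ASCII characters out of raw bytes, similar in spirit to the Unix `strings` tool. It is a language-neutral Python sketch and not the repository's C# FeatureExtractor; the function name `extract_strings`, the minimum run length of 4, and the sample bytes are all assumptions made for the example.

```python
import re

# Assumed minimum run length; the repository may use a different value.
MIN_LENGTH = 4

def extract_strings(data: bytes, min_length: int = MIN_LENGTH) -> list[str]:
    """Return runs of printable ASCII (0x20-0x7e) at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]

# Made-up bytes that mix binary noise with two readable names.
sample = b"\x00\x01MZ\x90\x00kernel32.dll\x00\xff\xfeLoadLibraryA\x00ab\x00"
found = extract_strings(sample)
print(found)
```

Short runs such as "MZ" and "ab" fall below the assumed minimum length and are skipped, which keeps most random byte noise out of the feature text.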

The sampledata.csv file attached to the project contains 8 rows of random data. Each row contains two columns. The first is the classification, with true being "malicious" and false being "benign". The second is the strings data extracted from the file. These properties are mapped in the FileInput class.
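The two-column layout described above can be sketched as follows. This is an assumption-laden illustration rather than the project's actual file handling: the comma separator, the lowercase true/false spelling, the helper `parse_row`, and the row contents are all hypothetical.

```python
import csv
import io

# Hypothetical rows: (is_malicious, strings_data). Values are made up.
rows = [
    (True, "CreateRemoteThread VirtualAllocEx"),
    (False, "kernel32.dll GetLastError"),
]

# Write the rows in the assumed layout: classification first, strings second.
buffer = io.StringIO()
writer = csv.writer(buffer)
for is_malicious, strings_data in rows:
    writer.writerow([str(is_malicious).lower(), strings_data])

def parse_row(row):
    """Map one CSV row back to (is_malicious, strings_data)."""
    return row[0] == "true", row[1]

# Read the rows back and print the label each one carries.
buffer.seek(0)
parsed = [parse_row(row) for row in csv.reader(buffer)]
for is_malicious, strings_data in parsed:
    label = "malicious" if is_malicious else "benign"
    print(label, "->", strings_data)
```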

The application trains the model with the SDCA trainer in its logistic regression variant.
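SDCA is the optimization method; the model it fits is an ordinary logistic regression, which turns a weighted sum of features into a probability. The sketch below shows only that prediction step, not the SDCA training procedure and not ML.NET's implementation. The weights, bias, feature values, and the 0.5 threshold are hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features, threshold=0.5):
    """Return (is_malicious, probability) for one feature vector."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(score)
    return probability >= threshold, probability

# Hypothetical learned parameters and a hypothetical two-feature input.
weights = [1.5, -2.0]
bias = -0.5
is_malicious, probability = predict(weights, bias, [0.0, 1.0])
print("malicious" if is_malicious else "benign", round(probability, 3))
```

With these made-up numbers the score is negative, so the probability falls below the threshold and the input is labeled benign.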

Note: If you are looking for sample files, the c:\Windows and c:\Windows\System32 folders contain numerous Windows executables and DLLs. In addition, if you want malicious-looking files that are actually clean, you can create them on the fly at http://cwg.io in various file formats.

Run the console application with command-line arguments:

Assuming a folder of files named data-files exists, extract the features with the following command:

D:\Machine Learning Projects\FileClassifier\bin\Debug\net8.0\FileClassifier.exe extract data-files

Output: Extracted 8 files to sampledata.csv

Train the model using sampledata.csv:

D:\Machine Learning Projects\FileClassifier\bin\Debug\net8.0\FileClassifier.exe train "D:\Machine Learning Projects\FileClassifier\Data\sampledata.csv"

A FileClassifier model will be created in the folder D:\Machine Learning Projects\FileClassifier\bin\Debug\net8.0\

Run a prediction with the newly trained model, using the compiled FileClassifier.exe as the input file:

predict "D:\Machine Learning Projects\FileClassifier\bin\Debug\net8.0\FileClassifier.exe"

Output: Based on the file (D:\Machine Learning Projects\FileClassifier\bin\Debug\net8.0\FileClassifier.exe) the file is classified as benign at a confidence level of 4%

