The Mule
Posted December 16, 2020

Hello @zak100,

I think a more important step to take before assembling 100,000 instances of data is benchmarking the performance of some baseline or prototype model that you can train on smaller datasets. Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority. At the moment, however, I think it is more crucial for you to read the articles that @Ghideon and I have discussed.

With regard to assisting you through this process, I meant that I can provide a simple model for you in Keras: not a model for SC vulnerability detection, but rather one to minimally acquaint you with Keras. Thank you @Ghideon for showing me how to write code on here.

Here is the general model pipeline in Keras:

1. Create the instance of your model
2. Compile the model
3. Fit your data to the model
4. Evaluate your model's performance
5. Predict new batches of data or datasets using the saved model weights (a sketch of this step follows after the code below)

# Example from some project I did.
# data = a pandas dataframe with features from a ...
import pandas as pd
import tensorflow as tf
import numpy as np
import itertools as it
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import ElasticNet
from sklearn.linear_model import Lasso
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import normalize
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow import keras
from tensorflow.keras import layers

# the 'target' is what you want to predict
# since you do not have the data I used here, this code will not actually run as-is;
# it is purely for illustrative purposes
target = data.pop('Wattage')
data = MinMaxScaler().fit_transform(data.values).reshape(len(data), 5)

# your own data would go in here: the first four columns are the features,
# the last column is the label
dataset = tf.data.Dataset.from_tensor_slices((data[:, :4], data[:, -1].reshape(-1, 1)))

# create the training, testing, and validation datasets
train_size = int(len(data) * 0.7)
test_size = int(len(data) * 0.15)
train_dataset = dataset.take(train_size)
test_dataset = dataset.skip(train_size)
val_dataset = test_dataset.skip(test_size)
test_dataset = test_dataset.take(test_size)

# tf.data pipelines must be batched before being passed to fit/evaluate
train_dataset = train_dataset.batch(32)
val_dataset = val_dataset.batch(32)
test_dataset = test_dataset.batch(32)

# THIS IS THE IMPORTANT PART, FOR BUILDING A MODEL
# a Keras Sequential model with three dense layers, the last being the output layer
# here we set units=1 in the last layer because we are predicting one target
model = keras.Sequential(
    [
        layers.Dense(units=32, activation='relu', name='layer1'),
        layers.Dense(units=64, activation='relu', name='layer2'),
        layers.Dense(units=1, name='end'),
    ]
)

# compile the model with the chosen optimizer, loss, and metrics
model.compile(
    optimizer='adam',
    loss='mse',  # shorthand for tf.keras.losses.MeanSquaredError()
    metrics=['mse'],
)

# fit the model to the training dataset and specify the validation dataset;
# EarlyStopping halts training when the validation loss stops improving
model.fit(
    x=train_dataset,
    epochs=20,
    validation_data=val_dataset,
    verbose=1,
    callbacks=[tf.keras.callbacks.EarlyStopping(patience=5)],
    shuffle=False,
)

# evaluate the model's performance on the held-out test dataset
model.evaluate(
    x=test_dataset,
    verbose=1,
)

# save the model for future use, so you do not have to retrain it
model.save(
    filepath='/tmp/trained_on_cleaned_02',
)
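One note on step 5: the snippet above stops at saving the model, so here is a minimal sketch of what loading and predicting could look like. This assumes the model was saved to '/tmp/trained_on_cleaned_02' as above; the `new_data` array is a hypothetical stand-in for your own feature rows.

import numpy as np
import tensorflow as tf

# load the previously saved model (architecture + trained weights)
restored_model = tf.keras.models.load_model('/tmp/trained_on_cleaned_02')

# hypothetical new batch: 10 rows with the same 4 feature columns as above
new_data = np.random.rand(10, 4).astype('float32')

# predict one value per row
predictions = restored_model.predict(new_data)
print(predictions.shape)  # (10, 1)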
zak100 (Author)
Posted December 17, 2020

Hi @The Mule and @Ghideon, thanks for the discussion and for providing the essential steps for creating a Python model.

<bench-marking the performance of some baseline or prototype model that you can train on smaller datasets>
Yes, you are right. First I have to come up with some prototype model.

<Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority. However, at the moment,>
Yes, I would appreciate your help as much as possible. I have got some idea; I hope once I read the paper, I will know more.

<I would think it is more crucial for you to read the articles that me and @Ghideon have discussed.>
Surely I will look at the articles that you (@The Mule) and @Ghideon have pointed to.

God bless you.
Zulfi.
Ghideon
Posted December 17, 2020

22 hours ago, The Mule said:
I think a more important step to take before assembling 100,000 instances of data is bench-marking the performance of some baseline or prototype model that you can train on smaller datasets. Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority.

For the paper zak100 provided in the opening post, my understanding was that the authors presented experimental evidence that their approach works and performs well. Hence I recommended first checking that it was possible to get hold of the data. But your contributions allow for the shift of focus you suggest above: there are now multiple ways to get data, and possible alternative approaches. I support your approach of starting with a prototype model, unless Zak still wants to pursue exactly the approach described in the paper in the OP.

22 hours ago, The Mule said:
However, at the moment, I would think it is more crucial for you to read the articles that me and @Ghideon have discussed.

I agree.

22 hours ago, The Mule said:
one to just minimally acquaint you with Keras

@zak100 Here is a quick attempt at a little quiz (I hope @The Mule corrects me if I get this wrong). The model in the Keras code example provided by The Mule differs slightly from your initial question and from the approach in the paper. Which Keras class might be a reasonable starting point for the kind of neural network used in the paper* in your first post?

*) The paper in the opening post: Towards Safer Smart Contracts: A Sequence Learning Approach to Detecting Security Threats
The Mule
Posted December 17, 2020

1 hour ago, Ghideon said:
The model in the Keras code example provided by The Mule differs slightly from your initial question and the approach in the paper

In fact, the Python code that I inserted above is quite different from the methods employed in the paper. It was a very rudimentary presentation of what the structure of something written in Keras might look like. Nonetheless, I agree with @Ghideon that @zak100 should attempt to piece together which major components of Keras correspond to the methods used in the paper. As an example, if the paper mentioned a Convolutional Neural Network, then two appropriate steps would be (1) a Google search for "keras layers" -> https://keras.io/api/layers/ -> look at the convolutional layers, or (2) a Keras CNN tutorial such as https://victorzhou.com/blog/keras-cnn-tutorial/. A minimal sketch of what such a model could look like follows below.
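For illustration only, here is a minimal sketch of a small Keras CNN (this is not the paper's model; the input shape (28, 28, 1), layer sizes, and 10-class output are assumptions for a toy image-classification example):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# a tiny convolutional network: one conv layer, one pooling layer,
# then a dense softmax output over 10 hypothetical classes
cnn = keras.Sequential(
    [
        layers.Conv2D(filters=8, kernel_size=3, activation='relu',
                      input_shape=(28, 28, 1), name='conv1'),
        layers.MaxPooling2D(pool_size=2, name='pool1'),
        layers.Flatten(name='flatten'),
        layers.Dense(units=10, activation='softmax', name='out'),
    ]
)

cnn.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
cnn.summary()  # prints the layer structure, analogous to the docs linked above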
zak100 (Author)
Posted December 17, 2020

Hi my friends, thanks a lot. Right now I am struggling with the Maian tool; I will get back to you soon. If I can't execute it, we will have to adopt a different approach, and possibly I will have to look at the other papers @The Mule has posted.

Zulfi.