The Mule
Posted December 16, 2020

Hello @zak100,

I think a more important step to take before assembling 100,000 instances of data is benchmarking the performance of some baseline or prototype model that you can train on smaller datasets. Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority. At the moment, however, I think it is more crucial for you to read the articles that @Ghideon and I have discussed.

With regard to assisting you through this process, I meant that I can provide a simple model for you in Keras: not a model for SC vulnerability detection, but rather one to minimally acquaint you with Keras. Thank you @Ghideon for showing me how to write code on here.

Here is the general model pipeline in Keras:

1. Create the instance of your model
2. Compile the model
3. Fit your data to the model
4. Evaluate your model's performance
5. Predict new batches of data or datasets using the saved model weights (a sketch of this step follows after the code below)

# Example from some project I did.
# data = a pandas dataframe with features from a ...
import pandas as pd
import tensorflow as tf
import numpy as np
import itertools as it
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import ElasticNet
from sklearn.linear_model import Lasso
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import normalize
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow import keras
from tensorflow.keras import layers

# the 'target' is what you want to predict
# since you do not have the data I used here, this code will not actually run as-is;
# it is purely for illustrative purposes
target = data.pop('Wattage')
data = MinMaxScaler().fit_transform(data.values).reshape(len(data), 5)

# your own data would go in here: the first four columns are the features,
# the last column is the label
dataset = tf.data.Dataset.from_tensor_slices((data[:, :4], data[:, -1].reshape(-1, 1)))

# create the training, testing, and validation datasets
train_size = int(len(data) * 0.7)
test_size = int(len(data) * 0.15)
train_dataset = dataset.take(train_size)
test_dataset = dataset.skip(train_size)
val_dataset = test_dataset.skip(test_size)
test_dataset = test_dataset.take(test_size)

# tf.data pipelines must be batched before being passed to fit/evaluate
train_dataset = train_dataset.batch(32)
val_dataset = val_dataset.batch(32)
test_dataset = test_dataset.batch(32)

# THIS IS THE IMPORTANT PART, FOR BUILDING A MODEL
# a Keras Sequential model with three dense layers, the last being the output layer
# here we set units=1 in the last layer because we are predicting one target
model = keras.Sequential(
    [
        layers.Dense(units=32, activation='relu', name='layer1'),
        layers.Dense(units=64, activation='relu', name='layer2'),
        layers.Dense(units=1, name='end'),
    ]
)

# compile the model with the chosen optimizer, loss, and metrics
model.compile(
    optimizer='adam',
    loss='mse',  # shorthand for tf.keras.losses.MeanSquaredError()
    metrics=['mse'],
)

# fit the model to the training dataset and specify the validation dataset;
# EarlyStopping halts training when the validation loss stops improving
model.fit(
    x=train_dataset,
    epochs=20,
    validation_data=val_dataset,
    verbose=1,
    callbacks=[tf.keras.callbacks.EarlyStopping(patience=5)],
    shuffle=False,
)

# evaluate the model's performance on the held-out test dataset
model.evaluate(
    x=test_dataset,
    verbose=1,
)

# save the model for future use, so you do not have to retrain it
model.save(
    filepath='/tmp/trained_on_cleaned_02',
)
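One note on step 5: the snippet above stops at saving the model, so here is a minimal sketch of what loading and predicting could look like. This assumes the model was saved to '/tmp/trained_on_cleaned_02' as above; the `new_data` array is a hypothetical stand-in for your own feature rows.

import numpy as np
import tensorflow as tf

# load the previously saved model (architecture + trained weights)
restored_model = tf.keras.models.load_model('/tmp/trained_on_cleaned_02')

# hypothetical new batch: 10 rows with the same 4 feature columns as above
new_data = np.random.rand(10, 4).astype('float32')

# predict one value per row
predictions = restored_model.predict(new_data)
print(predictions.shape)  # (10, 1)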
zak100 (Author)
Posted December 17, 2020

Hi @The Mule and @Ghideon, thanks for the discussion and for providing the essential steps for creating a Python model.

<bench-marking the performance of some baseline or prototype model that you can train on smaller datasets>
Yes, you are right. First I have to come up with some prototype model.

<Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority. However, at the moment,>
Yes, I would appreciate your help as much as possible. I have got some idea; I hope once I read the paper, I will know more.

<I would think it is more crucial for you to read the articles that me and @Ghideon have discussed.>
Surely I will look at the articles that you (@The Mule) and @Ghideon have pointed to.

God bless you.
Zulfi.
Ghideon
Posted December 17, 2020

22 hours ago, The Mule said:
I think a more important step to take before assembling 100,000 instances of data is bench-marking the performance of some baseline or prototype model that you can train on smaller datasets. Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority.

For the paper zak100 provided in the opening post, my understanding was that the authors presented experimental evidence that their approach works and performs well. Hence I recommended first checking that it was possible to get hold of the data. But your contributions allow for the shift of focus you suggest above: there are now multiple ways to get data, and possible alternative approaches. I support your approach of starting with a prototype model, unless Zak still wants to pursue exactly the approach described in the paper in the OP.

22 hours ago, The Mule said:
However, at the moment, I would think it is more crucial for you to read the articles that me and @Ghideon have discussed.

I agree.

22 hours ago, The Mule said:
one to just minimally acquaint you with Keras

@zak100 Here is a quick attempt at a little quiz (I hope @The Mule corrects me if I get this wrong). The model in the Keras code example provided by The Mule differs slightly from your initial question and from the approach in the paper. Which Keras class might be a reasonable starting point for the kind of neural network used in the paper* in your first post?

*) The paper in the opening post: Towards Safer Smart Contracts: A Sequence Learning Approach to Detecting Security Threats
The Mule
Posted December 17, 2020

1 hour ago, Ghideon said:
The model in the Keras code example provided by The Mule differs slightly from your initial question and the approach in the paper

In fact, the Python code that I inserted above is quite different from the methods employed in the paper. It was a very rudimentary presentation of what the structure of something written in Keras might look like. Nonetheless, I agree with @Ghideon that @zak100 should attempt to piece together which major components of Keras correspond to the methods used in the paper. As an example, if the paper mentioned a Convolutional Neural Network, then two appropriate steps would be (1) a Google search for "keras layers" -> https://keras.io/api/layers/ -> look at the convolutional layers, or (2) a Keras CNN tutorial such as https://victorzhou.com/blog/keras-cnn-tutorial/. A minimal sketch of what such a model could look like follows below.
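For illustration only, here is a minimal sketch of a small Keras CNN (this is not the paper's model; the input shape (28, 28, 1), layer sizes, and 10-class output are assumptions for a toy image-classification example):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# a tiny convolutional network: one conv layer, one pooling layer,
# then a dense softmax output over 10 hypothetical classes
cnn = keras.Sequential(
    [
        layers.Conv2D(filters=8, kernel_size=3, activation='relu',
                      input_shape=(28, 28, 1), name='conv1'),
        layers.MaxPooling2D(pool_size=2, name='pool1'),
        layers.Flatten(name='flatten'),
        layers.Dense(units=10, activation='softmax', name='out'),
    ]
)

cnn.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
cnn.summary()  # prints the layer structure, analogous to the docs linked above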
zak100 (Author)
Posted December 17, 2020

Hi my friends, thanks a lot. Right now I am struggling with the Maian tool; I will get back to you soon. If I can't execute it, we will have to adopt a different approach, and possibly I will have to look at the other papers @The Mule has posted.

Zulfi.