majeedk

Members


Posts
2
Joined
September 2, 2013
Last visited
September 5, 2013

Everything posted by majeedk

K-means clustering for usage profiling

majeedk replied to majeedk's topic in Computer Science

Every user of a mobile device has a unique usage pattern, which also depends on other factors such as the time of day, the day of the week and their location. The idea is to use clustering to capture every variation of a user's behaviour, so that future readings from the same user fit into the clusters created from the training data. If I take readings from a different user (USER B) and try to fit them into the profile of USER A, they should not fit. Similarly, if there is some kind of malware on the device, it should alter the usage pattern, so the readings should no longer fit the existing profile. This usage profiling is part of my PhD research. I hope this helps.
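One way to make "fits into the profile" concrete is to treat each cluster as a centroid plus a radius learned from the training data, and then count how many new readings fall outside their nearest cluster. The sketch below is only an illustration under assumptions that are not part of the description above: the centroids are already available from k-means, distance is Euclidean, the cluster boundary is the largest training distance seen in that cluster, and the helper names (`nearest`, `cluster_radii`, `outlier_fraction`) and the toy readings are made up.

```python
import math

def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]

def cluster_radii(train, centroids):
    """Assumed boundary: largest training distance seen in each cluster."""
    radii = [0.0] * len(centroids)
    for p in train:
        idx, d = nearest(p, centroids)
        radii[idx] = max(radii[idx], d)
    return radii

def outlier_fraction(readings, centroids, radii):
    """Fraction of readings lying outside their nearest cluster's radius."""
    outside = 0
    for p in readings:
        idx, d = nearest(p, centroids)
        if d > radii[idx]:
            outside += 1
    return outside / len(readings)

# Toy two-feature readings on the 0-100 scale (invented for illustration).
centroids_a = [(10.0, 10.0), (80.0, 80.0)]   # assumed output of k-means for USER A
train_a = [(8, 12), (12, 9), (11, 11), (78, 82), (83, 79), (80, 77)]
radii_a = cluster_radii(train_a, centroids_a)

new_a = [(9, 10), (81, 80)]               # later readings from USER A
user_b = [(45, 50), (50, 40), (12, 10)]   # readings from USER B

print(outlier_fraction(new_a, centroids_a, radii_a))   # low: fits the profile
print(outlier_fraction(user_b, centroids_a, radii_a))  # higher: poor fit
```

A maximum-distance radius is sensitive to a single extreme training point; a percentile of the training distances, or a per-cluster mean plus a few standard deviations, would be a gentler choice. How large the outside fraction must be before the data counts as "a different user" is a threshold that would have to be tuned on real data.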
- September 2, 2013
- 2 replies
K-means clustering for usage profiling

majeedk posted a topic in Computer Science

I am trying to use k-means clustering to profile the usage behaviour of mobile device users. My data consists of system-level and user-level readings such as the number of calls and SMS messages, CPU and memory usage, and the number of user and system applications/services. The readings are taken from the mobile device every 5 minutes and scaled to the range 0-100. The clustering is done in MATLAB on a computer.

The idea is to use, say, one month's data for training (i.e. clustering) and then compare future data with the existing clusters to measure the (dis)similarity between the two. The assumption is that different users have different usage patterns, so readings from USER B will not fit into the clusters built from USER A's data.

I have two questions:

1. After training (clustering), how do I compare new data with the existing clusters to determine (dis)similarity, i.e. whether the new data belongs to the same user or not? I am thinking of finding the nearest cluster and then checking whether the point lies within that cluster's boundary.

2. I am using a silhouette plot to assess the clustering quality, and I get some negative values (see the attached figure). I have read that a negative value means that the record is more similar to the records of its neighbouring cluster than to other members of its own cluster. Should I be concerned about my results, or is it normal to have some negative values? If it needs to be fixed, how do I detect the readings causing the problem?
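On the second question, the silhouette value is defined per reading, so the readings behind the negative bars can be listed directly. For reading i, with a(i) the mean distance to the other members of its own cluster and b(i) the smallest mean distance to the members of any other cluster, the value is s(i) = (b(i) - a(i)) / max(a(i), b(i)). The sketch below computes this in plain Python as an illustration only. It assumes Euclidean distance, at least two clusters, and cluster labels already produced by k-means; the function name `silhouette_values` and the toy points are invented.

```python
import math

def silhouette_values(points, labels):
    """Per-point silhouette values; assumes at least two distinct labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # common convention for a one-point cluster
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        values.append((b - a) / max(a, b))
    return values

# Toy data: the last point is labelled 0 but sits closer to cluster 1.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (6, 6)]
labels = [0, 0, 0, 1, 1, 0]

s = silhouette_values(points, labels)
negative = [i for i, v in enumerate(s) if v < 0]
print(negative)  # indices of readings with a negative silhouette value
```

In MATLAB the same per-reading values should be available as the return value of `silhouette(X, clust)`, so `find(s < 0)` would give the row indices to inspect. A handful of slightly negative values usually just marks readings that sit between two clusters; many strongly negative values suggest trying a different number of clusters.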
- September 2, 2013
- 2 replies
