Sunday, 8 January 2017

IEEE-2019: Conundrum-Pass: A New Graphical Password Approach

Abstract: Graphical passwords are most widely used as a mechanism for authentication in today's mobile computing environment. This methodology was introduced to enhance security and overcome the vulnerabilities of textual passwords, PINs, and other trivial password methodologies, which were difficult to remember and prone to external attacks. Many graphical password schemes have been proposed over time; however, most of them suffer from shoulder surfing and can be easily guessed, which is a serious problem. The technique proposed in this paper allows the user to keep the ease of use of the pattern lock while minimizing the risk of shoulder surfing and password guessing. The user divides a picture into multiple chunks, and selecting the previously defined chunks while unlocking successfully unlocks the device. This technique can effectively resist shoulder surfing and smudge attacks, and it is also resilient to password guessing and dictionary attacks. The proposed methodology can significantly improve the security of the graphical password system with no increase in unlocking time.



IEEE-2019: Secure and Efficient Skyline Queries on Encrypted Data
Abstract: Outsourcing data and computation to cloud server provides a cost-effective way to support large scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can be also used as a building block for other queries. Furthermore, we demonstrate two optimizations, data partitioning and lazy merging, to further reduce the computation load. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions.
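
To make the skyline and dominance notions in this abstract concrete, here is a minimal plaintext sketch in Python. It is not the secure protocol from the paper: it runs on unencrypted tuples, assumes that smaller values are better in every dimension, and uses made-up hotel data (price, distance) purely for illustration. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better. a dominates b when a is no worse
    in every dimension and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative data only: (price, distance to the beach).
hotels = [(50, 8), (60, 3), (80, 2), (90, 5), (55, 9)]
result = skyline(hotels)
print(result)
```

In the outsourced setting described above, the comparisons inside `dominates` are the part that would have to be carried out on encrypted values, which is why a secure dominance protocol is the key subroutine.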



IEEE 2018: Human Identification From Freestyle Walks Using Posture-Based Gait Feature 
Abstract: With the increase of terrorist threats around the world, human identification research has become a sought after area of research. Unlike standard biometric recognition techniques, gait recognition is a non-intrusive technique. Both data collection and classification processes can be done without a subject’s cooperation. In this work, we propose a new model-based gait recognition technique called postured-based gait recognition. It consists of two elements: posture-based features and posture-based classification. Posture-based features are composed of displacements of all joints between current and adjacent frames and Center-of-Body (CoB) relative coordinates of all joints, where the coordinates of each joint come from its relative position to four joints: hip-center, hip-left, hip-right, and spine joints, from the front forward. The CoB relative coordinate system is a critical part to handle the different observation angle issue. In posture-based classification, postured-based gait features of all frames are considered. The dominant subject becomes a classification result. The postured-based gait recognition technique outperforms existing techniques in both fixed direction and freestyle walk scenarios where turning around and changing directions are involved. This suggests that a set of postures and quick movements are sufficient to identify a person. The proposed technique also performs well under the gallery-size test and the cumulative match characteristic test, which implies that the postured-based gait recognition technique is not gallery-size sensitive and is a good potential tool for forensic and surveillance use.

IEEE 2018: A Data Mining based Model for Detection of Fraudulent Behaviour in Water Consumption
Abstract: Fraudulent behavior in drinking water consumption is a significant problem facing water supplying companies and agencies. This behavior results in a massive loss of income and forms the highest percentage of non-technical loss. Finding efficient measurements for detecting fraudulent activities has been an active research area in recent years. Intelligent data mining techniques can help water supplying companies to detect these fraudulent activities to reduce such losses. This research explores the use of two classification techniques (SVM and KNN) to detect suspicious fraud water customers. The main motivation of this research is to assist Yarmouk Water Company (YWC) in Irbid city of Jordan to overcome its profit loss. The SVM based approach uses customer load profile attributes to expose abnormal behavior that is known to be correlated with non-technical loss activities. The data has been collected from the historical data of the company billing system. The accuracy of the generated model hit a rate of over 74% which is better than the current manual prediction procedures taken by the YWC. To deploy the model, a decision tool has been built using the generated model. The system will help the company to predict suspicious water customers to be inspected on site.
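
As a rough illustration of one of the two classifiers named in this abstract, the sketch below implements a plain k-nearest-neighbours (KNN) vote in Python. It is only a sketch under stated assumptions: the two features (normalised average consumption and relative drop against the previous year) and all data points are invented for illustration and are not taken from the YWC billing data, and the function name is hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Label a query by majority vote among its k nearest training samples."""
    # Sort the training samples by Euclidean distance to the query.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy profiles: (normalised average consumption, relative drop).
train = [
    ((0.90, 0.05), "normal"),
    ((0.80, 0.10), "normal"),
    ((0.85, 0.00), "normal"),
    ((0.20, 0.70), "suspicious"),
    ((0.30, 0.60), "suspicious"),
    ((0.25, 0.80), "suspicious"),
]

print(knn_predict(train, (0.28, 0.65)))
print(knn_predict(train, (0.88, 0.04)))
```

Because KNN relies on distances, features on very different scales would normally need to be normalised first; the toy features here are already in the range 0 to 1.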

IEEE 2018: Machine Learning Methods for Disease Prediction with Claims Data 
Abstract: One of the primary challenges of healthcare delivery is aggregating disparate, asynchronous data sources into meaningful indicators of individual health. We combine natural language word embedding and network modeling techniques to learn meaningful representations of medical concepts by using the weighted network adjacency matrix in the GloVe algorithm, which we call Code2Vec. We demonstrate that using our learned embeddings improve neural network performance for disease prediction. However, we also demonstrate that popular deep learning models for disease prediction are not meaningfully better than simpler, more interpretable classifiers such as XGBoost. Additionally, our work adds to the current literature by providing a comprehensive survey of various machine learning algorithms on disease prediction tasks.

IEEE 2017: NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media
Abstract: Nowadays, a large share of people rely on content available in social media when making decisions (e.g., reviews and feedback on a topic or product). The possibility that anybody can leave a review provides a golden opportunity for spammers to write spam reviews about products and services for different interests. Identifying these spammers and the spam content is a hot topic of research, and although a considerable number of studies have been done recently toward this end, the methodologies put forth so far still barely detect spam reviews, and none of them show the importance of each extracted feature type. In this study, we propose a novel framework, named NetSpam, which utilizes spam features for modeling review datasets as heterogeneous information networks to map the spam detection procedure into a classification problem in such networks. Using the importance of spam features helps us to obtain better results in terms of different metrics on real-world review datasets from the Yelp and Amazon websites. The results show that NetSpam outperforms the existing methods and that, among four categories of features (review-behavioral, user-behavioral, review-linguistic, and user-linguistic), the first type of features performs better than the other categories.

IEEE 2017: One-time Password for Biometric Systems: Disposable Feature Templates
Abstract: Biometric access control systems are becoming more commonplace in society. However, these systems are susceptible to replay attacks. During a replay attack, an attacker can capture packets of data that represent an individual’s biometric. The attacker can then replay the data and gain unauthorized access into the system. Traditional password based systems have the ability to use a one-time password scheme. This allows for a unique password to authenticate an individual, and it is then disposed of. Any captured password will not be effective. Traditional biometric systems use a single feature extraction method to represent an individual, making captured data harder to change than a password. There are hashing techniques that can be used to transmute biometric data into a unique form, but techniques like this require some external dongle to work successfully. The proposed technique in this work can uniquely represent individuals with each access attempt. The number of unique representations will be further increased by a genetic feature selection technique that uses a unique subset of biometric features. The features extracted are from an improved genetic-based extraction technique that performed well on periocular images. The results in this manuscript show that the improved extraction technique coupled with the feature selection technique has an improved identification performance compared with the traditional genetic-based extraction approach. The features are also shown to be unique enough to determine that a replay attack is occurring, compared with a more traditional feature extraction technique.

IEEE 2016: A Shoulder Surfing Resistant Graphical Authentication System 
Abstract: Authentication based on passwords is used largely in applications for computer security and privacy. However, human actions such as choosing bad passwords and inputting passwords in an insecure way are regarded as “the weakest link” in the authentication chain. Rather than arbitrary alphanumeric strings, users tend to choose passwords that are either short or meaningful for easy memorization. With web applications and mobile apps piling up, people can access these applications anytime and anywhere with various devices. This evolution brings great convenience but also increases the probability of exposing passwords to shoulder surfing attacks. Attackers can observe directly or use external recording devices to collect users’ credentials. To overcome this problem, we proposed a novel authentication system, PassMatrix, based on graphical passwords to resist shoulder surfing attacks. With a one-time valid login indicator and circulative horizontal and vertical bars covering the entire scope of pass-images, PassMatrix offers no hint for attackers to figure out or narrow down the password even if they conduct multiple camera-based attacks. We also implemented a PassMatrix prototype on Android and carried out real user experiments to evaluate its memorability and usability. From the experimental result, the proposed system achieves better resistance to shoulder surfing attacks while maintaining usability.

Thursday, 5 January 2017

IEEE-2019: Improving Heart Disease Prediction Using Feature Selection Approaches

Abstract: Heart disease is a disorder of the heart and blood vessels. It is very difficult for medical practitioners and doctors to diagnose heart disease accurately. Data science plays an important role in early prediction and in solving large-data problems nowadays. This research paper describes the prediction of heart disease in the medical field using data science. Many researchers have studied this problem, but the accuracy of prediction still needs to be improved. This research therefore focuses on feature selection techniques and algorithms, where multiple heart disease datasets are used for experimental analysis and to show the accuracy improvement. Using RapidMiner as the tool, the Decision Tree, Logistic Regression, Logistic Regression SVM, Naïve Bayes, and Random Forest algorithms are used as feature selection techniques, and the improvement in accuracy is shown in the results.

IEEE-2019: Disease Influence Measure Based Diabetic Prediction with Medical Data Set Using Data Mining
Abstract: The problem of diabetic prediction has been well studied in this paper. Disease prediction has been explored using various methods of data mining. The use of medical data sets for the prediction of diabetes mellitus has been analyzed. This paper performs a detailed survey of disease prediction using data mining approaches based on a diabetic data set. The presence of disease has been identified using the appearance of various symptoms. However, the methods use different features and produce varying accuracy. The result of prediction differs with the methods, measures, and features being used. Towards diabetic prediction, a Disease Influence Measure (DIM) based diabetic prediction has been presented. The method preprocesses the input data set and removes the noisy records. In the second stage, the method estimates the disease influence measure (DIM) based on the features of the input data point. Based on the DIM value, the method performs diabetic prediction. Different approaches to disease prediction have been considered and their performance in disease prediction has been compared. The analysis result has been presented in detail towards the development.




IEEE-2018: A Novel Mechanism for Fast Detection of Transformed Data Leakage
Abstract: Data leakage is a growing insider threat in information security among organizations and individuals. A series of methods have been developed to address the problem of data leakage prevention (DLP). However, large amounts of unstructured data need to be tested in the Big Data era. As the volume of data grows dramatically and the forms of data become much more complicated, it is a new challenge for DLP to deal with large amounts of transformed data. We propose an Adaptive weighted Graph Walk model (AGW) to solve this problem by mapping it to the dimension of weighted graphs. Our approach solves this problem in three steps. First, the adaptive weighted graphs are built to quantify the sensitivity of tested data based on its context. Then, the improved label propagation is used to enhance the scalability for fresh data. Finally, a low-complexity score walk algorithm is proposed to determine the ultimate sensitivity. Experimental results show that the proposed method can detect leaks of transformed or fresh data quickly and efficiently.







IEEE 2017: Privacy and Secure Medical Data Transmission and Analysis for Wireless Sensing Healthcare System
Abstract: The convergence of Internet of Things (IoT), cloud computing and wireless body-area networks (WBANs) has greatly promoted the industrialization of e-/m-healthcare (electronic-/mobile-healthcare). However, the further flourishing of e-/m-healthcare still faces many challenges including information security and privacy preservation. To address these problems, a healthcare system (HES) framework is designed that collects medical data from WBANs, transmits them through an extensive wireless sensor network infrastructure and finally publishes them into wireless personal area networks (WPANs) via a gateway. Furthermore, HES involves the GSRM (Groups of Send-Receive Model) scheme to realize key distribution and secure data transmission, the HEBM (Homomorphic Encryption Based on Matrix) scheme to ensure privacy and an expert system able to analyze the scrambled medical data and feed back the results automatically.

IEEE 2017: Privacy-Preserving Location-Proximity for Mobile Apps

Abstract: Location Based Services (LBS) have seen alarming privacy breaches in recent years. While there has been much recent progress by the research community on developing privacy-enhancing mechanisms for LBS, their evaluation has been often focused on the privacy guarantees, while the question of whether these mechanisms can be adopted by practical LBS applications has received limited attention. This paper studies the applicability of Privacy-Preserving Location Proximity (PPLP) protocols in the setting of mobile apps. We categorize popular location social apps and analyze the tradeoffs of privacy and functionality with respect to PPLP enhancements. To investigate the practical performance trade-offs, we present an in-depth case study of an Android application that implements Inner Circle, a state-of-the-art protocol for privacy preserving location proximity. This study indicates that the performance of the privacy-preserving application for coarse-grained precision is comparable to real applications with the same feature set.




IEEE 2017: IoT based Home Security through Digital Image Processing Algorithms
Abstract: This paper gives an outline of an automatic system to control and secure the home, based on digital image processing with the help of the Internet of Things (IoT). The system consists of a sensor, a digital camera, a database in the fog, and a mobile phone. Sensors placed in the frame of the door alert the camera to capture an image of whoever intends to enter the house; the image is then sent to the database or dataset stored in the fog. Image analysis is performed to detect, recognize, and match the image against the stored dataset of authenticated people or pets. If the captured image does not match the dataset, an alert message is sent to the owner of the house. The image processing algorithms are considered in terms of the space and time complexity of processing the captured image to cross-check it against the dataset stored in the fog.

Tuesday, 3 January 2017

IEEE 2017: SociRank: Identifying and Ranking Prevalent News Topics Using Social Media Factors

Abstract: Mass media sources, specifically the news media, have traditionally informed us of daily events. In modern times, social media services such as Twitter provide an enormous amount of user-generated data, which have great potential to contain informative news-related content. For these resources to be useful, we must find a way to filter noise and only capture the content that, based on its similarity to the news media, is considered valuable. However, even after noise is removed, information overload may still exist in the remaining data—hence, it is convenient to prioritize it for consumption. To achieve prioritization, information must be ranked in order of estimated importance considering three factors. First, the temporal prevalence of a particular topic in the news media is a factor of importance, and can be considered the media focus (MF) of a topic. Second, the temporal prevalence of the topic in social media indicates its user attention (UA). Last, the interaction between the social media users who mention this topic indicates the strength of the community discussing it, and can be regarded as the user interaction (UI) toward the topic. We propose an unsupervised framework—SociRank—which identifies news topics prevalent in both social media and the news media, and then ranks them by relevance using their degrees of MF, UA, and UI. Our experiments

IEEE 2017: RAPARE: A Generic Strategy for Cold-Start Rating Prediction Problem
Abstract: In recent years, the recommender system has become one of the indispensable components in many e-commerce websites. One of the major challenges that largely remains open is the cold-start problem, which can be viewed as a barrier that keeps the cold-start users/items away from the existing ones. In this paper, we aim to break through this barrier for cold-start users/items with the assistance of existing ones. In particular, inspired by the classic Elo Rating System, which has been widely adopted in chess tournaments, we propose a novel rating comparison strategy (RAPARE) to learn the latent profiles of cold-start users/items. The centerpiece of our RAPARE is to provide a fine-grained calibration on the latent profiles of cold-start users/items by exploring the differences between cold-start and existing users/items. As a generic strategy, our proposed strategy can be instantiated into existing methods in recommender systems. To reveal the capability of the RAPARE strategy, we instantiate our strategy on two prevalent methods in recommender systems, i.e., the matrix factorization based and neighborhood based collaborative filtering.
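
For readers unfamiliar with the Elo Rating System that this abstract cites as its inspiration, the following Python sketch shows the classic Elo update used in chess. It is not the RAPARE strategy itself, whose details are not given here; the K-factor of 32 and the function names are assumptions chosen for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> Tuple[float, float]:
    """Return the updated ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k (assumed to be 32 here) controls how strongly one result moves a rating.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two equally rated players; A wins.
print(elo_update(1500.0, 1500.0, 1.0))
```

The key idea is that a rating is corrected by the difference between the observed and the expected outcome, which is the kind of comparison-driven calibration the abstract alludes to.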

IEEE 2017: l-Injection: Toward Effective Collaborative Filtering Using Uninteresting Items
Abstract: We develop a novel framework, named as l-injection, to address the sparsity problem of recommender systems. By carefully injecting low values to a selected set of unrated user-item pairs in a user-item matrix, we demonstrate that top-N recommendation accuracies of various collaborative filtering (CF) techniques can be significantly and consistently improved. We first adopt the notion of pre-use preferences of users toward a vast amount of unrated items. Using this notion, we identify uninteresting items that have not been rated yet but are likely to receive low ratings from users, and selectively impute them as low values. As our proposed approach is method-agnostic, it can be easily applied to a variety of CF algorithms. Through comprehensive experiments with three real-life datasets (e.g., Movielens, Ciao, and Watcha), we demonstrate that our solution consistently and universally enhances the accuracies of existing CF algorithms (e.g., item-based CF, SVD-based CF, and SVD++) by 2.5 to 5 times on average. Furthermore, our solution improves the running time of those CF methods by 1.2 to 2.3 times when its setting produces the best accuracy.
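
The injection step described in this abstract can be sketched roughly as follows in Python. This is a simplified illustration, not the authors' implementation: it assumes the pre-use preference scores are already available as an input (how they are inferred is not covered here), and the parameter `theta`, the fill value `0.0`, and all names and data are hypothetical.

```python
from typing import Dict

Ratings = Dict[str, Dict[str, float]]


def inject_low_values(ratings: Ratings, pre_use: Ratings,
                      theta: float = 0.5, low_value: float = 0.0) -> Ratings:
    """Fill a fraction of each user's unrated items with a low rating.

    ratings:  user -> {item: observed rating} (rated items only)
    pre_use:  user -> {item: estimated pre-use preference} (assumed given)
    theta:    fraction of unrated items, lowest preference first, to fill
    """
    filled: Ratings = {}
    for user, rated in ratings.items():
        # Unrated candidates, ordered from least to most preferred.
        candidates = sorted(
            (item for item in pre_use.get(user, {}) if item not in rated),
            key=lambda item: pre_use[user][item],
        )
        n_inject = int(len(candidates) * theta)
        filled[user] = dict(rated)  # copy so the input is left unchanged
        for item in candidates[:n_inject]:
            filled[user][item] = low_value
    return filled


# Invented toy data for illustration.
ratings = {"u1": {"a": 5.0, "b": 4.0}}
pre_use = {"u1": {"c": 0.1, "d": 0.9, "e": 0.2, "f": 0.7}}
print(inject_low_values(ratings, pre_use, theta=0.5))
```

The densified matrix can then be handed to any collaborative filtering method, which is what makes the approach method-agnostic.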


IEEE 2017: Vehicular Cloud Data Collection for Intelligent Transportation Systems

Abstract: The Internet of Things (IoT) envisions connecting billions of sensors to the Internet, in order to provide new applications and services for smart cities. IoT will allow the evolution of the Internet of Vehicles (IoV) from existing Vehicular Ad hoc Networks (VANETs), in which the delivery of various services will be offered to drivers by integrating vehicles, sensors, and mobile devices into a global network. To serve VANETs with computational resources, Vehicular Cloud Computing (VCC) has recently been envisioned with the objective of providing traffic solutions to improve our daily driving. These solutions involve applications and services for the benefit of Intelligent Transportation Systems (ITS), which represent an important part of IoV. Data collection is an important aspect of ITS, which can effectively serve online travel systems with the aid of the Vehicular Cloud (VC). In this paper, we use the new paradigm of VCC to propose a data collection model for the benefit of ITS. We show via simulation results that the participation of a low percentage of vehicles in a dynamic VC is sufficient to provide meaningful data collection.


IEEE 2017: Optimizing Green Energy, Cost, and Availability in Distributed Data Centers
Abstract: Integrating renewable energy and ensuring high availability are two major requirements for geo-distributed data centers. Availability is ensured by provisioning spare capacity across the data centers to mask data center failures (either partial or complete). We propose a mixed integer linear programming formulation for capacity planning while minimizing the total cost of ownership (TCO) for highly available, green, distributed data centers. We minimize the cost due to power consumption and server deployment, while targeting a minimum usage of green energy. Solving our model shows that capacity provisioning considering green energy integration not only lowers the carbon footprint but also reduces the TCO. Results show that up to 40% green energy usage is feasible with a marginal increase in the TCO compared to the other cost-aware models.



IEEE 2016 : SPORE : A Sequential Personalized Spatial Item Recommender System

Abstract: With the rapid development of location-based social networks (LBSNs), spatial item recommendation has become an important way of helping users discover interesting locations to increase their engagement with location-based services. Although human movement exhibits sequential patterns in LBSNs, most current studies on spatial item recommendations do not consider the sequential influence of locations. Leveraging sequential patterns in spatial item recommendation is, however, very challenging, considering 1) users’ check-in data in LBSNs has a low sampling rate in both space and time, which renders existing prediction techniques on GPS trajectories ineffective; 2) the prediction space is extremely large, with millions of distinct locations as the next prediction target, which impedes the application of classical Markov chain models; and 3) there is no existing framework that unifies users’ personal interests and the sequential influence in a principled manner. In light of the above challenges, we propose a sequential personalized spatial item recommendation framework (SPORE) which introduces a novel latent variable topic-region to model and fuse sequential influence with personal interests in the latent and exponential space. The advantages of modeling the sequential effect at the topic-region level include a significantly reduced prediction space, an effective alleviation of data sparsity and a direct expression of the semantic meaning of users’ spatial activities. Furthermore, we design an asymmetric Locality Sensitive Hashing (ALSH) technique to speed up the online top-k recommendation process by extending the traditional LSH. We evaluate the performance of SPORE on two real datasets and one large-scale synthetic dataset. The results demonstrate a significant improvement in SPORE’s ability to recommend spatial items, in terms of both effectiveness and efficiency, compared with the state-of-the-art methods.

IEEE 2016 : Truth Discovery in Crowdsourced Detection of Spatial Events
Abstract: The ubiquity of smartphones has led to the emergence of mobile crowdsourcing tasks such as the detection of spatial events when smartphone users move around in their daily lives. However, the credibility of those detected events can be negatively impacted by unreliable participants with low-quality data. Consequently, a major challenge in quality control is to discover true events from diverse and noisy participants’ reports. This truth discovery problem is uniquely distinct from its online counterpart in that it involves uncertainties in both participants’ mobility and reliability. Decoupling these two types of uncertainties through location tracking will raise severe privacy and energy issues, whereas simply ignoring missing reports or treating them as negative reports will significantly degrade the accuracy of the discovered truth. In this paper, we propose a new method to tackle this truth discovery problem through principled probabilistic modeling. In particular, we integrate the modeling of location popularity, location visit indicators, truth of events and three-way participant reliability in a unified framework. The proposed model is thus capable of efficiently handling various types of uncertainties and automatically discovering truth without any supervision or the need of location tracking. Experimental results demonstrate that our proposed method outperforms existing state-of-the-art truth discovery approaches in the mobile crowdsourcing environment.

IEEE 2016 : Sentiment Analysis of Top Colleges in India Using Twitter Data
Abstract: In today’s world, opinions and reviews accessible to us are one of the most critical factors in formulating our views and influencing the success of a brand, product or service. With the advent and growth of social media in the world, stakeholders often take to expressing their opinions on popular social media, namely Twitter. While Twitter data is extremely informative, it presents a challenge for analysis because of its humongous and disorganized nature. This paper is a thorough effort to dive into the novel domain of performing sentiment analysis of people’s opinions regarding top colleges in India. Besides taking additional preprocessing measures like the expansion of net lingo and removal of duplicate tweets, a probabilistic model based on Bayes’ theorem was used for spelling correction, which is overlooked in other research studies. This paper also highlights a comparison between the results obtained by exploiting the following machine learning algorithms: Naïve Bayes and Support Vector Machine, and an Artificial Neural Network model: Multilayer Perceptron. Furthermore, a contrast has been presented between four different kernels of SVM: RBF, linear, polynomial and sigmoid.

IEEE 2016 : FRAppE: Detecting Malicious Facebook Applications
Abstract: With 20 million installs a day [1], third-party apps are a major reason for the popularity and addictiveness of Facebook. Unfortunately, hackers have realized the potential of using apps for spreading malware and spam. The problem is already significant, as we find that at least 13% of apps in our dataset are malicious. So far, the research community has focused on detecting malicious posts and campaigns. In this paper, we ask the question: given a Facebook application, can we determine if it is malicious? Our key contribution is in developing FRAppE (Facebook’s Rigorous Application Evaluator), arguably the first tool focused on detecting malicious apps on Facebook. To develop FRAppE, we use information gathered by observing the posting behavior of 111K Facebook apps seen across 2.2 million users on Facebook. First, we identify a set of features that help us distinguish malicious apps from benign ones. For example, we find that malicious apps often share names with other apps, and they typically request fewer permissions than benign apps. Second, leveraging these distinguishing features, we show that FRAppE can detect malicious apps with 99.5% accuracy, with no false positives and a low false negative rate (4.1%). Finally, we explore the ecosystem of malicious Facebook apps and identify mechanisms that these apps use to propagate. Interestingly, we find that many apps collude and support each other; in our dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps through their posts. Long-term, we see FRAppE as a step towards creating an independent watchdog for app assessment and ranking, so as to warn Facebook users before installing apps.
