Thursday, 28 December 2017

IEEE 2018: A Data Mining based Model for Detection of Fraudulent Behaviour in Water Consumption

Abstract: Fraudulent behavior in drinking water consumption is a significant problem facing water supplying companies and agencies. This behavior results in a massive loss of income and forms the highest percentage of non-technical loss. Finding efficient measurements for detecting fraudulent activities has been an active research area in recent years. Intelligent data mining techniques can help water supplying companies to detect these fraudulent activities to reduce such losses. This research explores the use of two classification techniques (SVM and KNN) to detect suspicious fraud water customers. The main motivation of this research is to assist Yarmouk Water Company (YWC) in Irbid city of Jordan to overcome its profit loss. The SVM based approach uses customer load profile attributes to expose abnormal behavior that is known to be correlated with non-technical loss activities. The data has been collected from the historical data of the company billing system. The accuracy of the generated model hit a rate of over 74% which is better than the current manual prediction procedures taken by the YWC. To deploy the model, a decision tool has been built using the generated model. The system will help the company to predict suspicious water customers to be inspected on site.
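
As a rough illustration of the KNN side of this approach, the sketch below classifies a customer by majority vote among the nearest labelled load profiles. The two features (mean monthly consumption and month-to-month variation), the sample values, and the function name `knn_predict` are assumptions made for this example; the abstract does not list the actual attributes or data used for YWC.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # train is a list of (feature_vector, label) pairs
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical load profiles: (mean monthly consumption, month-to-month variation)
profiles = [
    ((30.0, 2.0), "normal"),
    ((28.0, 3.0), "normal"),
    ((32.0, 2.5), "normal"),
    ((5.0, 0.5), "suspicious"),
    ((4.0, 1.0), "suspicious"),
    ((6.0, 0.8), "suspicious"),
]

print(knn_predict(profiles, (29.0, 2.2)))  # nearest profiles are all "normal"
print(knn_predict(profiles, (5.5, 0.7)))   # nearest profiles are all "suspicious"
```

In practice the features would need scaling before distances are compared, since Euclidean distance is dominated by whichever attribute has the largest range.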

Friday, 22 December 2017

IEEE 2018: An Efficient and Privacy-Preserving Biometric Identification Scheme in Cloud Computing

Abstract: Biometric identification has become increasingly popular in recent years. With the development of cloud computing, database owners are motivated to outsource large volumes of biometric data and identification tasks to the cloud to avoid expensive storage and computation costs; this, however, brings potential threats to users' privacy. In this paper, we propose an efficient and privacy-preserving biometric identification outsourcing scheme. Specifically, the biometric data is encrypted and outsourced to the cloud server. To execute a biometric identification, the database owner encrypts the query data and submits it to the cloud. The cloud performs identification operations over the encrypted database and returns the result to the database owner. A thorough security analysis indicates that the proposed scheme is secure even if attackers can forge identification requests and collude with the cloud. Compared with previous protocols, experimental results show that the proposed scheme achieves a better performance in both preparation and identification procedures.

Wednesday, 22 November 2017

IEEE 2018: MEDIBOX – IoT Enabled Patient Assisting Device

Abstract: The health and wellness sector is critical to human society and as such should be one of the first to receive the benefits of upcoming technologies like IoT. Some of the Internet of Medical Things (IoMT) are connected to IoT networks to monitor the day-to-day activities of the patients. Recently there have been attempts to design new medical devices which monitor medications and help elderly people with assisted living. In this paper, one such attempt is made to design a multipurpose portable intelligent device named MEDIBOX which helps patients take their medications at the right time. The box keeps parameters such as temperature and humidity within the controlled range recommended by the drug manufacturer and thus maintains the potency of the medicines even if the patient is travelling. Related to this, we have developed a Host Management System (HMS) which is capable of cloud-based installation and monitoring that stores and controls the MEDIBOX functionality for further analysis and future modification in design aspects.

Thursday, 16 November 2017

IEEE 2018: Secure Attribute-Based Signature Scheme with Multiple Authorities for Blockchain in Electronic Health Records Systems


Abstract: Electronic Health Records (EHRs) are entirely controlled by hospitals instead of patients, which complicates seeking medical advice from different hospitals. Patients face a critical need to focus on the details of their own healthcare and restore management of their own medical data. The rapid development of blockchain technology promotes population healthcare, including medical records as well as patient-related data. This technology provides patients with comprehensive, immutable records, and access to EHRs free from service providers and treatment websites. In this paper, to guarantee the validity of EHRs encapsulated in blockchain, we present an attribute-based signature scheme with multiple authorities, in which a patient endorses a message according to the attribute while disclosing no information other than the evidence that he has attested to it. Furthermore, there are multiple authorities without a trusted single or central one to generate and distribute public/private keys of the patient, which avoids the escrow problem and conforms to the mode of distributed data storage in the blockchain. By sharing the secret pseudorandom function seeds among authorities, this protocol resists collusion attacks by up to N − 1 corrupted authorities out of N. Under the computational bilinear Diffie-Hellman assumption, we also formally demonstrate that, in terms of the unforgeability and perfect privacy of the attribute-signer, this attribute-based signature scheme is secure in the random oracle model. The comparison shows the efficiency and properties of the proposed method relative to methods proposed in other studies.

Monday, 6 November 2017

IEEE 2018: Malware Threats and Detection for Industrial Mobile-IoT Networks

Abstract: Industrial IoT networks deploy heterogeneous IoT devices to meet a wide range of user requirements. These devices are usually pooled from private or public IoT cloud providers. A significant number of IoT cloud providers integrate smartphones to overcome the latency of IoT devices and low computational power problems. However, the integration of mobile devices with industrial IoT networks exposes the IoT devices to significant malware threats. Mobile malware is the highest threat to the security of IoT data, user's personal information, identity, and corporate/financial information. This paper analyzes the efforts regarding malware threats aimed at the devices deployed in industrial mobile-IoT networks and related detection techniques. We considered static, dynamic, and hybrid detection analysis. In this performance analysis, we compared static, dynamic, and hybrid analyses on the basis of data set, feature extraction techniques, feature selection techniques, detection methods, and the accuracy achieved by these methods. Therefore, we identify suspicious API calls, system calls, and the permissions that are extracted and selected as features to detect mobile malware. This will assist application developers in the safe use of APIs when developing applications for industrial IoT networks.

IEEE 2018: Smart parking in the Smart city application

Abstract: Insufficient parking capacities trouble almost every city today. The demand for parking space is considerably higher than the supply, and since creating new parking facilities is economically very challenging, it is important to look for ways to make the most of the existing parking space, especially where on-street parking is concerned. The aim is therefore to apply systems for efficient use of existing parking space, focusing in particular on monitoring the occupancy of parking space and providing the information to drivers. A large-scale pilot project was implemented in Uherské Hradiště in the second half of the year 2017. It involved testing of features and subsystems for parking management as well as monitoring the turnover and occupancy of parking spaces in the city. This article describes the course of the pilot project and the detection and action elements of the system that were employed, and also deals with the evaluation and the outcomes of the pilot testing.

IEEE 2018: SkyShield: A Sketch-Based Defense System Against Application Layer DDoS Attacks

Abstract: Application layer distributed denial of service (DDoS) attacks have become a severe threat to the security of web servers. These attacks evade most intrusion prevention systems by sending numerous benign HTTP requests. Since most of these attacks are launched abruptly and severely, a fast intrusion prevention system is desirable to detect and mitigate these attacks as soon as possible. In this paper, we propose an effective defense system, named SkyShield, which leverages the sketch data structure to quickly detect and mitigate application layer DDoS attacks. First, we propose a novel calculation of the divergence between two sketches, which alleviates the impact of network dynamics and improves the detection accuracy. Second, we utilize the abnormal sketch to facilitate the identification of malicious hosts of an ongoing attack. This improves the efficiency of SkyShield by avoiding the reverse calculation of malicious hosts. We have developed a prototype of SkyShield and carefully evaluated its effectiveness using real attack data collected from a large-scale web cluster. The experimental results show that SkyShield can quickly reduce malicious requests, while posing a limited impact on normal users.
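
To make the idea of comparing two sketches concrete, here is a minimal example of a hashed counter table and a distance between two such tables. It is only a sketch under stated assumptions: the class name `Sketch`, the table dimensions, the SHA-256 based row hashing, and the use of the Hellinger distance are choices made for this illustration. The abstract says SkyShield proposes its own novel divergence calculation, which is not reproduced here.

```python
import hashlib
import math

class Sketch:
    """Small counter table: each key increments one cell per row."""

    def __init__(self, rows=3, width=16):
        self.rows, self.width = rows, width
        self.table = [[0] * width for _ in range(rows)]

    def _index(self, row, key):
        # Deterministic per-row hash (an assumption; any hash family would do)
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.width

    def add(self, key, count=1):
        for r in range(self.rows):
            self.table[r][self._index(r, key)] += count

def hellinger(a, b):
    """Average per-row Hellinger distance between two non-empty sketches."""
    total = 0.0
    for row_a, row_b in zip(a.table, b.table):
        sa, sb = sum(row_a), sum(row_b)
        if sa == 0 or sb == 0:
            raise ValueError("both sketches must contain at least one request")
        s = sum((math.sqrt(x / sa) - math.sqrt(y / sb)) ** 2
                for x, y in zip(row_a, row_b))
        total += math.sqrt(s / 2)
    return total / a.rows

# Hypothetical traffic: ten hosts sending 10 requests each in both periods,
# plus one host flooding the server in the second period.
baseline, current = Sketch(), Sketch()
for i in range(10):
    baseline.add(f"host{i}", 10)
    current.add(f"host{i}", 10)
print(hellinger(baseline, current))   # identical traffic, distance 0.0
current.add("bot", 500)
print(hellinger(baseline, current))   # the flood shifts the distribution
```

A detector built this way would raise an alarm when the distance between consecutive periods exceeds a threshold; how that threshold is chosen is outside the scope of this example.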


IEEE 2018: A proposed approach for preventing Cross-Site Scripting
Abstract: This paper introduces Cross-Site Scripting (XSS), a major threat facing web pages. Because such threats affect the design and development of web pages, web developers must be aware of the various types of web attacks and know how to prevent or mitigate them. Web developers should understand how attackers target websites and exploit weak points when users fill in forms, register, or open suspicious links or email attachments. The importance of this subject lies in providing detailed information about identifying these types of web threats, understanding their impact, and protecting against them. The paper aims to give both web developers and users enough knowledge, while developing and using websites, to prevent such attacks and reduce them. It uses PHP functions to implement this protection, evaluate the efficiency of the resulting web pages, and prevent XSS attacks.
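
The core defence against XSS is escaping untrusted input before it is written into a page. The abstract refers to PHP functions without naming them, so the sketch below uses Python's standard `html.escape` as an analogous illustration; the function name `render_comment` and the sample payload are hypothetical.

```python
import html

def render_comment(user_input: str) -> str:
    # Escape <, >, & and quotes so the browser treats the input as text, not markup
    return "<p>" + html.escape(user_input, quote=True) + "</p>"

payload = '<script>alert("xss")</script>'
print(render_comment(payload))
# prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

Escaping at output time, in the context where the value is used, is generally safer than trying to filter dangerous strings at input time.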

Monday, 9 January 2017

IEEE 2017: Vehicular Cloud Data Collection for Intelligent Transportation Systems

Abstract: The Internet of Things (IoT) envisions connecting billions of sensors to the Internet, in order to provide new applications and services for smart cities. IoT will allow the evolution of the Internet of Vehicles (IoV) from existing Vehicular Ad hoc Networks (VANETs), in which the delivery of various services will be offered to drivers by integrating vehicles, sensors, and mobile devices into a global network. To serve VANET with computational resources, Vehicular Cloud Computing (VCC) is recently envisioned with the objective of providing traffic solutions to improve our daily driving. These solutions involve applications and services for the benefit of Intelligent Transportation Systems (ITS), which represent an important part of IoV. Data collection is an important aspect in ITS, which can effectively serve online travel systems with the aid of Vehicular Cloud (VC). In this paper, we involve the new paradigm of VCC to propose a data collection model for the benefit of ITS. We show via simulation results that the participation of low percentage of vehicles in a dynamic VC is sufficient to provide meaningful data collection.


IEEE 2017: Optimizing Green Energy, Cost, and Availability in Distributed Data Centers
Abstract: Integrating renewable energy and ensuring high availability are two major requirements for geo-distributed data centers. Availability is ensured by provisioning spare capacity across the data centers to mask data center failures (either partial or complete). We propose a mixed integer linear programming formulation for capacity planning while minimizing the total cost of ownership (TCO) for highly available, green, distributed data centers. We minimize the cost due to power consumption and server deployment, while targeting a minimum usage of green energy. Solving our model shows that capacity provisioning that considers green energy integration not only lowers the carbon footprint but also reduces the TCO. Results show that up to 40% green energy usage is feasible with a marginal increase in the TCO compared to the other cost-aware models.


IEEE 2017: A Collision-Mitigation Cuckoo Hashing Scheme for Large-scale Storage Systems
Abstract: With the rapid growth of the amount of information, cloud computing servers need to process and analyze large amounts of high-dimensional and unstructured data timely and accurately. This usually requires many query operations. Due to their simplicity and ease of use, cuckoo hashing schemes have been widely used in real-world cloud-related applications. However, due to potential hash collisions, cuckoo hashing suffers from endless loops and high insertion latency, and even a high risk of reconstructing the entire hash table. In order to address these problems, we propose a cost-efficient cuckoo hashing scheme, called MinCounter. The idea behind MinCounter is to alleviate the occurrence of endless loops in the data insertion by selecting unbusy kicking-out routes. MinCounter selects the “cold” (infrequently accessed), rather than random, buckets to handle hash collisions. We further improve the concurrency of the MinCounter scheme to pursue higher performance and adapt to concurrent applications. MinCounter has the salient features of offering efficient insertion and query services and delivering high performance of cloud servers, as well as enhancing the experiences for cloud users. We have implemented MinCounter in a large-scale cloud test bed and examined the performance by using three real-world traces. Extensive experimental results demonstrate the efficacy and efficiency of MinCounter.
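
The "cold bucket" idea can be sketched as follows: every bucket carries a counter of how often it has evicted a key, and on a collision the insert evicts from the candidate bucket with the smaller counter instead of a random one. This is a simplified toy written for this listing, not the authors' implementation; the class name `MinCounterTable`, the two hash functions, and the integer-only keys are assumptions.

```python
class MinCounterTable:
    """Toy cuckoo hash set for non-negative integer keys, one key per bucket."""

    def __init__(self, size=8, max_kicks=32):
        self.size = size
        self.max_kicks = max_kicks
        self.slots = [None] * size   # stored keys
        self.kicks = [0] * size      # how often each bucket has evicted a key

    def _candidates(self, key):
        # Two simple, hypothetical hash functions chosen for readability
        return key % self.size, (key // self.size) % self.size

    def contains(self, key):
        return any(self.slots[i] == key for i in self._candidates(key))

    def insert(self, key):
        if self.contains(key):
            return True
        for _ in range(self.max_kicks):
            cands = self._candidates(key)
            for i in cands:
                if self.slots[i] is None:
                    self.slots[i] = key
                    return True
            # Both candidates are full: evict from the "cold" bucket,
            # i.e. the one with the fewest kick-outs so far
            victim = min(cands, key=lambda i: self.kicks[i])
            self.kicks[victim] += 1
            key, self.slots[victim] = self.slots[victim], key
        # Gave up: the key currently in hand is not stored.
        # A real table would rehash or resize at this point.
        return False

table = MinCounterTable()
for k in (1, 9, 17):           # all three keys share the first candidate bucket
    table.insert(k)
print(table.slots[:3])         # prints: [1, 9, 17]
```

Choosing the bucket with the smallest counter spreads evictions across the table, which is the intuition behind avoiding the long or endless kick-out chains the abstract describes.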


Sunday, 8 January 2017

IEEE 2018: Human Identification From Freestyle Walks Using Posture-Based Gait Feature

Abstract: With the increase of terrorist threats around the world, human identification research has become a sought-after area of research. Unlike standard biometric recognition techniques, gait recognition is a non-intrusive technique. Both data collection and classification processes can be done without a subject’s cooperation. In this work, we propose a new model-based gait recognition technique called posture-based gait recognition. It consists of two elements: posture-based features and posture-based classification. Posture-based features are composed of displacements of all joints between current and adjacent frames and Center-of-Body (CoB) relative coordinates of all joints, where the coordinates of each joint come from its relative position to four joints: hip-center, hip-left, hip-right, and spine joints, from the front forward. The CoB relative coordinate system is critical for handling differences in observation angle. In posture-based classification, posture-based gait features of all frames are considered. The dominant subject becomes the classification result. The posture-based gait recognition technique outperforms existing techniques in both fixed direction and freestyle walk scenarios where turning around and changing directions are involved. This suggests that a set of postures and quick movements are sufficient to identify a person. The proposed technique also performs well under the gallery-size test and the cumulative match characteristic test, which implies that the posture-based gait recognition technique is not gallery-size sensitive and is a good potential tool for forensic and surveillance use.

Thursday, 5 January 2017

IEEE 2016 : Practical Approximate k Nearest Neighbor Queries with Location and Query Privacy



IEEE 2016 Transaction on Data Mining

Abstract: In mobile communication, spatial queries pose a serious threat to user location privacy because the location of a query may reveal sensitive information about the mobile user. In this paper, we study approximate k nearest neighbor (kNN) queries where the mobile user queries the location-based service (LBS) provider about approximate k nearest points of interest (POIs) on the basis of his current location. We propose a basic solution and a generic solution for the mobile user to preserve his location and query privacy in approximate kNN queries. The proposed solutions are mainly built on the Paillier public-key cryptosystem and can provide both location and query privacy. To preserve query privacy, our basic solution allows the mobile user to retrieve one type of POIs, for example, approximate k nearest car parks, without revealing to the LBS provider what type of points is retrieved. Our generic solution can be applied to multiple discrete type attributes of private location-based queries. Compared with existing solutions for kNN queries with location privacy, our solution is more efficient. Experiments have shown that our solution is practical for kNN queries.
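
The property of the Paillier cryptosystem that such schemes rely on is additive homomorphism: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a server can combine encrypted values without seeing them. The toy below shows only that property, with tiny primes picked for readability; it is not the paper's kNN protocol, and the function names are assumptions made for this example.

```python
import math
import random

def keygen(p=293, q=433):
    # Toy primes for illustration only; real keys use very large primes
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1
    mu = pow(lam, -1, n)       # modular inverse; valid for the choice g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    while True:                # pick a random r that is coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

def add(pub, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts (mod n)
    n, _ = pub
    return (c1 * c2) % (n * n)

pub, priv = keygen()
c = add(pub, encrypt(pub, 17), encrypt(pub, 25))
print(decrypt(pub, priv, c))   # prints: 42
```

The three-argument `pow` with a negative exponent requires Python 3.8 or later.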


Wednesday, 4 January 2017

IEEE 2016 : PassBYOP: Bring Your Own Picture for Securing Graphical Passwords


IEEE 2016 Transaction on Image Processing

Abstract: PassBYOP is a new graphical password scheme for public terminals that replaces the static digital images typically used in graphical password systems with personalized physical tokens, herein in the form of digital pictures displayed on a physical user-owned device such as a mobile phone. Users present these images to a system camera and then enter their password as a sequence of selections on live video of the token. Highly distinctive optical features are extracted from these selections and used as the password. We present three feasibility studies of PassBYOP examining its reliability, usability, and security against observation. The reliability study shows that image-feature based passwords are viable and suggests appropriate system thresholds: password items should contain a minimum of seven features, 40% of which must geometrically match originals stored on an authentication server in order to be judged equivalent. The usability study measures task completion times and error rates, revealing these to be 7.5 s and 9%, broadly comparable with prior graphical password systems that use static digital images. Finally, the security study highlights PassBYOP’s resistance to observation attack: three attackers are unable to compromise a password using shoulder surfing, camera based observation, or malware. These results indicate that PassBYOP shows promise for security while maintaining the usability of current graphical password schemes.



IEEE 2016 : Single-sample Face Recognition Based on LPP Feature Transfer

IEEE 2016 Transaction on Image Processing
Abstract: Due to its wide applications in practice, face recognition has been an active research topic. With the availability of adequate training samples, many machine learning methods could yield high face recognition accuracy. However, under the circumstance of inadequate training samples, especially the extreme case of having only a single training sample, face recognition becomes challenging. How to deal with conflicting concerns of the small sample size and high dimensionality in one-sample face recognition is critical for its achievable recognition accuracy and feasibility in practice. Being different from conventional methods for global face recognition based on generalization ability promotion and local face recognition depending on image segmentation, a single-sample face recognition algorithm based on Locality Preserving Projection (LPP) feature transfer is proposed here. First, transfer sources are screened to obtain the selective sample source using the whitened cosine similarity metric. Secondly, we project the vectors of source faces and target faces into feature sub-space by LPP respectively, and calculate the feature transfer matrix to approximate the mapping relationship on source faces and target faces in subspace. Then, the feature transfer matrix is used on training samples to transfer the original macro characteristics to target macro characteristics. Finally, the nearest neighbor classifier is used for face recognition. Our results based on popular databases FERET, ORL and Yale demonstrate the superiority of the proposed LPP feature transfer based one-sample face recognition algorithm when compared with popular single-sample face recognition algorithms such as (PC)2A and Block FLDA.



IEEE 2016 :  Reversible Data Hiding in Encrypted Image with Distributed Source Encoding

IEEE 2016 Transaction on Image Processing
Abstract: This paper proposes a novel scheme of reversible data hiding (RDH) in encrypted images using distributed source coding (DSC). After the original image is encrypted by the content owner using a stream cipher, the data-hider compresses a series of selected bits taken from the encrypted image to make room for the secret data. The selected bit series is Slepian-Wolf encoded using low density parity check (LDPC) codes. On the receiver side, the secret bits can be extracted if the image receiver has the embedding key only. In case the receiver has the encryption key only, he/she can recover the original image approximately with high quality using an image estimation algorithm. If the receiver has both the embedding and encryption keys, he/she can extract the secret data and perfectly recover the original image using the distributed source decoding. The proposed method outperforms previously published ones.



IEEE 2016 :  A Shoulder Surfing Resistant Graphical Authentication System

IEEE 2016 Transaction on Image Processing
Abstract: Authentication based on passwords is used largely in applications for computer security and privacy. However, human actions such as choosing bad passwords and inputting passwords in an insecure way are regarded as “the weakest link” in the authentication chain. Rather than arbitrary alphanumeric strings, users tend to choose passwords either short or meaningful for easy memorization. With web applications and mobile apps piling up, people can access these applications anytime and anywhere with various devices. This evolution brings great convenience but also increases the probability of exposing passwords to shoulder surfing attacks. Attackers can observe directly or use external recording devices to collect users’ credentials. To overcome this problem, we propose a novel authentication system, PassMatrix, based on graphical passwords to resist shoulder surfing attacks. With a one-time valid login indicator and circulative horizontal and vertical bars covering the entire scope of pass-images, PassMatrix offers no hint for attackers to figure out or narrow down the password even if they conduct multiple camera-based attacks. We also implemented a PassMatrix prototype on Android and carried out real user experiments to evaluate its memorability and usability. The experimental results show that the proposed system achieves better resistance to shoulder surfing attacks while maintaining usability.




IEEE 2016 : STAMP: Enabling Privacy-Preserving Location Proofs for Mobile Users


IEEE 2016 Transaction on Networking

Abstract: Location-based services are quickly becoming immensely popular. In addition to services based on users' current location, many potential services rely on users' location history, or their spatial-temporal provenance. Malicious users may lie about their spatial-temporal provenance without a carefully designed security system for users to prove their past locations. In this paper, we present the Spatial-Temporal provenance Assurance with Mutual Proofs (STAMP) scheme. STAMP is designed for ad-hoc mobile users generating location proofs for each other in a distributed setting. However, it can easily accommodate trusted mobile users and wireless access points. STAMP ensures the integrity and non-transferability of the location proofs and protects users' privacy. A semi-trusted Certification Authority is used to distribute cryptographic keys as well as guard users against collusion by a light-weight entropy-based trust evaluation approach. Our prototype implementation on the Android platform shows that STAMP is low-cost in terms of computational and storage resources. Extensive simulation experiments show that our entropy-based trust model is able to achieve high collusion detection accuracy.



IEEE 2016 : FRAppE: Detecting Malicious Facebook Applications

IEEE 2016 Transaction on Networking

Abstract: With 20 million installs a day [1], third-party apps are a major reason for the popularity and addictiveness of Facebook. Unfortunately, hackers have realized the potential of using apps for spreading malware and spam. The problem is already significant, as we find that at least 13% of apps in our dataset are malicious. So far, the research community has focused on detecting malicious posts and campaigns. In this paper, we ask the question: given a Facebook application, can we determine if it is malicious? Our key contribution is in developing FRAppE (Facebook’s Rigorous Application Evaluator), arguably the first tool focused on detecting malicious apps on Facebook. To develop FRAppE, we use information gathered by observing the posting behavior of 111K Facebook apps seen across 2.2 million users on Facebook. First, we identify a set of features that help us distinguish malicious apps from benign ones. For example, we find that malicious apps often share names with other apps, and they typically request fewer permissions than benign apps. Second, leveraging these distinguishing features, we show that FRAppE can detect malicious apps with 99.5% accuracy, with no false positives and a low false negative rate (4.1%). Finally, we explore the ecosystem of malicious Facebook apps and identify mechanisms that these apps use to propagate. Interestingly, we find that many apps collude and support each other; in our dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps through their posts. Long-term, we see FRAppE as a step towards creating an independent watchdog for app assessment and ranking, so as to warn Facebook users before installing apps.


IEEE 2016 : Toward Optimum Crowdsensing Coverage With Guaranteed Performance

IEEE 2016 Transaction on Networking

Abstract: Mobile crowdsensing networks have emerged to show elegant data collection capability in loosely cooperative networks. However, in terms of coverage quality, few works have considered efficient (fewer participants) and effective (more coverage) designs for mobile crowdsensing networks. We investigate the optimal coverage problem in distributed crowdsensing networks, in which the sensing quality and the information delivery are jointly considered. Different from the conventional coverage problem, ours selects only a subset of mobile users, so as to maximize the crowdsensing coverage with a limited budget. We formulate our concerns as an optimal crowdsensing coverage problem and prove its NP-completeness. In tackling this difficulty, we also prove the submodular property of our problem. Leveraging the favorable property in submodular optimization, we present a greedy algorithm with approximation ratio O(√k), where k is the number of selected users, such that the information delivery and sensing coverage ratio can be guaranteed. Finally, we make extensive evaluations of the proposed scheme with trace-driven tests. Evaluation results show that the proposed scheme could outperform random selection by 2× with a random walk model, and by over 3× with real trace data, in terms of crowdsensing coverage. Besides, the proposed scheme achieves a near-optimal solution compared with brute-force search results.
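
The greedy strategy for submodular coverage can be sketched in a few lines: repeatedly pick the user who adds the most not-yet-covered area until the budget is spent. This is the textbook greedy for plain set coverage, shown under simplifying assumptions; it leaves out the information-delivery constraint that the paper considers jointly, and the user names, cell numbers, and function name `greedy_select` are hypothetical.

```python
def greedy_select(coverage, budget):
    """Pick up to `budget` users, each time taking the largest marginal gain."""
    chosen, covered = [], set()
    for _ in range(budget):
        best = max(
            (u for u in coverage if u not in chosen),
            key=lambda u: len(coverage[u] - covered),
            default=None,
        )
        # Stop early if nobody is left or nobody adds new coverage
        if best is None or not (coverage[best] - covered):
            break
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered

# Hypothetical users and the grid cells each one can sense
coverage = {
    "u1": {1, 2, 3},
    "u2": {3, 4},
    "u3": {4, 5, 6, 7},
    "u4": {1, 7},
}
chosen, covered = greedy_select(coverage, budget=2)
print(chosen)    # prints: ['u3', 'u1']
print(covered)   # all seven cells are covered
```

Because coverage gain is submodular (each extra user helps less as more area is already covered), picking by marginal gain is the natural heuristic, which is the property the abstract says the authors prove for their formulation.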

Tuesday, 3 January 2017

IEEE 2016 : Inverted Linear Quadtree: Efficient Top K Spatial Keyword Search

IEEE 2016 Transaction on Data Mining

Abstract: With advances in geo-positioning technologies and geo-location services, there is a rapidly growing number of spatio-textual objects collected in many applications such as location based services and social networks, in which an object is described by its spatial location and a set of keywords (terms). Consequently, the study of spatial keyword search which explores both location and textual description of the objects has attracted great attention from commercial organizations and research communities. In this paper, we study two fundamental problems in spatial keyword queries: top k spatial keyword search (TOPK-SK), and batch top k spatial keyword search (BTOPK-SK). Given a set of spatio-textual objects, a query location and a set of query keywords, TOPK-SK retrieves the closest k objects each of which contains all keywords in the query. BTOPK-SK is the batch processing of sets of TOPK-SK queries. Based on the inverted index and the linear quadtree, we propose a novel index structure, called inverted linear quadtree (IL-Quadtree), which is carefully designed to exploit both spatial and keyword based pruning techniques to effectively reduce the search space. An efficient algorithm is then developed to tackle top k spatial keyword search. To further enhance the filtering capability of the signature of linear quadtree, we propose a partition based method. In addition, to deal with BTOPK-SK, we design a new computing paradigm which partitions the queries into groups based on both spatial proximity and the textual relevance between queries. We show that the IL-Quadtree technique can also efficiently support BTOPK-SK. Comprehensive experiments on real and synthetic data clearly demonstrate the efficiency of our methods.



IEEE 2016 : Truth Discovery in Crowdsourced Detection of Spatial Events

IEEE 2016 Transaction on Data Mining
Abstract: The ubiquity of smartphones has led to the emergence of mobile crowdsourcing tasks such as the detection of spatial events when smartphone users move around in their daily lives. However, the credibility of those detected events can be negatively impacted by unreliable participants with low-quality data. Consequently, a major challenge in quality control is to discover true events from diverse and noisy participants’ reports. This truth discovery problem is uniquely distinct from its online counterpart in that it involves uncertainties in both participants’ mobility and reliability. Decoupling these two types of uncertainties through location tracking will raise severe privacy and energy issues, whereas simply ignoring missing reports or treating them as negative reports will significantly degrade the accuracy of the discovered truth. In this paper, we propose a new method to tackle this truth discovery problem through principled probabilistic modeling. In particular, we integrate the modeling of location popularity, location visit indicators, truth of events and three-way participant reliability in a unified framework. The proposed model is thus capable of efficiently handling various types of uncertainties and automatically discovering truth without any supervision or the need of location tracking. Experimental results demonstrate that our proposed method outperforms existing state-of-the-art truth discovery approaches in the mobile crowdsourcing environment.



Monday, 2 January 2017

IEEE 2016 : SPORE: A Sequential Personalized Spatial Item Recommender System

IEEE 2016 Transaction on Data Mining
Abstract: With the rapid development of location-based social networks (LBSNs), spatial item recommendation has become an important way of helping users discover interesting locations to increase their engagement with location-based services. Although human movement exhibits sequential patterns in LBSNs, most current studies on spatial item recommendation do not consider the sequential influence of locations. Leveraging sequential patterns in spatial item recommendation is, however, very challenging, considering that 1) users’ check-in data in LBSNs has a low sampling rate in both space and time, which renders existing prediction techniques on GPS trajectories ineffective; 2) the prediction space is extremely large, with millions of distinct locations as the next prediction target, which impedes the application of classical Markov chain models; and 3) there is no existing framework that unifies users’ personal interests and the sequential influence in a principled manner. In light of the above challenges, we propose a sequential personalized spatial item recommendation framework (SPORE) which introduces a novel latent variable, topic-region, to model and fuse sequential influence with personal interests in the latent and exponential space. The advantages of modeling the sequential effect at the topic-region level include a significantly reduced prediction space, an effective alleviation of data sparsity and a direct expression of the semantic meaning of users’ spatial activities. Furthermore, we design an asymmetric Locality Sensitive Hashing (ALSH) technique to speed up the online top-k recommendation process by extending the traditional LSH. We evaluate the performance of SPORE on two real datasets and one large-scale synthetic dataset. The results demonstrate a significant improvement in SPORE’s ability to recommend spatial items, in terms of both effectiveness and efficiency, compared with the state-of-the-art methods.


IEEE 2016 : Sentiment Analysis of Top Colleges in India Using Twitter Data

IEEE 2016 Transaction on Data Mining
Abstract: In today’s world, the opinions and reviews accessible to us are among the most critical factors in formulating our views and influencing the success of a brand, product or service. With the advent and growth of social media, stakeholders often express their opinions on popular platforms such as Twitter. While Twitter data is extremely informative, it presents a challenge for analysis because of its humongous and disorganized nature. This paper is a thorough effort to dive into the novel domain of performing sentiment analysis of people’s opinions regarding top colleges in India. Besides taking additional preprocessing measures like the expansion of net lingo and removal of duplicate tweets, a probabilistic model based on Bayes’ theorem was used for spelling correction, which is overlooked in other research studies. This paper also compares the results obtained with two machine learning algorithms, Naïve Bayes and Support Vector Machine, and one Artificial Neural Network model, the Multilayer Perceptron. Furthermore, four different SVM kernels are contrasted: RBF, linear, polynomial and sigmoid.


IEEE 2016 : FRAppE: Detecting Malicious Facebook Applications

IEEE 2016 Transaction on Data Mining
Abstract: With 20 million installs a day [1], third-party apps are a major reason for the popularity and addictiveness of Facebook. Unfortunately, hackers have realized the potential of using apps for spreading malware and spam. The problem is already significant, as we find that at least 13% of apps in our dataset are malicious. So far, the research community has focused on detecting malicious posts and campaigns. In this paper, we ask the question: given a Facebook application, can we determine if it is malicious? Our key contribution is in developing FRAppE (Facebook’s Rigorous Application Evaluator), arguably the first tool focused on detecting malicious apps on Facebook. To develop FRAppE, we use information gathered by observing the posting behavior of 111K Facebook apps seen across 2.2 million users on Facebook. First, we identify a set of features that help us distinguish malicious apps from benign ones. For example, we find that malicious apps often share names with other apps, and they typically request fewer permissions than benign apps. Second, leveraging these distinguishing features, we show that FRAppE can detect malicious apps with 99.5% accuracy, with no false positives and a low false negative rate (4.1%). Finally, we explore the ecosystem of malicious Facebook apps and identify mechanisms that these apps use to propagate. Interestingly, we find that many apps collude and support each other; in our dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps through their posts. Long-term, we see FRAppE as a step towards creating an independent watchdog for app assessment and ranking, so as to warn Facebook users before installing apps.



Thursday, 29 December 2016

IEEE 2016: Secure Optimization Computation Outsourcing in Cloud Computing: A Case Study of Linear Programming




IEEE 2016 Transaction on Cloud Computing

Abstract: Cloud computing enables an economically promising paradigm of computation outsourcing. However, how to protect customers’ confidential data processed and generated during the computation is becoming the major security concern. Focusing on engineering computing and optimization tasks, this paper investigates secure outsourcing of widely applicable linear programming (LP) computations. Our mechanism design explicitly decomposes LP computation outsourcing into public LP solvers running on the cloud and private LP parameters owned by the customer. The resulting flexibility allows us to explore an appropriate security/efficiency tradeoff via a higher-level abstraction of LP computation than the general circuit representation. Specifically, by formulating the private LP problem as a set of matrices/vectors, we develop efficient privacy-preserving problem transformation techniques, which allow customers to transform the original LP into some random one while protecting sensitive input/output information. To validate the computation result, we further explore the fundamental duality theorem of LP and derive the necessary and sufficient conditions that correct results must satisfy. Such a result verification mechanism is very efficient and incurs close-to-zero additional cost on both the cloud server and customers. Extensive security analysis and experimental results show the immediate practicability of our mechanism design.
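
The duality-based result check mentioned above can be illustrated with a small Python sketch. It assumes the textbook primal/dual pair (minimize c·x subject to Ax >= b, x >= 0, and maximize b·y subject to Aᵀy <= c, y >= 0), which is not necessarily the LP form or the exact conditions used in the paper. The function name `verify_lp_result` and the sample numbers are hypothetical.

```python
def verify_lp_result(A, b, c, x, y, tol=1e-9):
    """Check a claimed optimal pair (x, y) using LP duality.

    Assumed textbook form (not taken from the paper):
      primal: minimize c.x  subject to A x >= b, x >= 0
      dual:   maximize b.y  subject to A^T y <= c, y >= 0
    If x and y are both feasible and the two objectives are equal,
    then both are optimal.
    """
    m, n = len(A), len(c)
    primal_feasible = all(v >= -tol for v in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) >= b[i] - tol for i in range(m)
    )
    dual_feasible = all(v >= -tol for v in y) and all(
        sum(A[i][j] * y[i] for i in range(m)) <= c[j] + tol for j in range(n)
    )
    primal_obj = sum(c[j] * x[j] for j in range(n))
    dual_obj = sum(b[i] * y[i] for i in range(m))
    return primal_feasible and dual_feasible and abs(primal_obj - dual_obj) <= tol


# Hypothetical instance: minimize 2*x1 + 3*x2 subject to x1 + x2 >= 4.
A, b, c = [[1.0, 1.0]], [4.0], [2.0, 3.0]
accepted = verify_lp_result(A, b, c, x=[4.0, 0.0], y=[2.0])  # both objectives equal 8
rejected = verify_lp_result(A, b, c, x=[0.0, 4.0], y=[2.0])  # primal objective is 12
```

The check needs only a few matrix-vector products, far less work than solving the LP, which is why verification of this kind can be cheap for the customer.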




IEEE 2016: On Traffic-Aware Partition and Aggregation in MapReduce for Big Data Applications

IEEE 2016 Transaction on Cloud Computing
Abstract: The MapReduce programming model simplifies large-scale data processing on commodity clusters by exploiting parallel map tasks and reduce tasks. Although many efforts have been made to improve the performance of MapReduce jobs, they ignore the network traffic generated in the shuffle phase, which plays a critical role in performance enhancement. Traditionally, a hash function is used to partition intermediate data among reduce tasks, which, however, is not traffic-efficient because network topology and the data size associated with each key are not taken into consideration. In this paper, we study how to reduce the network traffic cost of a MapReduce job by designing a novel intermediate data partition scheme. Furthermore, we jointly consider the aggregator placement problem, where each aggregator can reduce merged traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to deal with the large-scale optimization problem for big data applications, and an online algorithm is also designed to adjust data partition and aggregation in a dynamic manner. Finally, extensive simulation results demonstrate that our proposals can significantly reduce network traffic cost in both offline and online cases.
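
The traditional hash partitioning that the abstract criticizes can be sketched in a few lines of Python. This is only the baseline, not the traffic-aware scheme proposed in the paper. The function names, the use of CRC32 as the hash, and the sample key sizes are assumptions for illustration.

```python
import zlib
from collections import Counter


def hash_partition(key, num_reducers):
    """Baseline partitioner: reducer index = hash(key) mod number of reducers.

    CRC32 is used here only so the result is stable across runs (an assumption).
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(key_sizes, num_reducers):
    """Total intermediate data size sent to each reducer under hash partitioning."""
    loads = Counter()
    for key, size in key_sizes.items():
        loads[hash_partition(key, num_reducers)] += size
    return [loads[r] for r in range(num_reducers)]


# Hypothetical intermediate data: one key is far larger than the others.
key_sizes = {"user:1": 900, "user:2": 10, "user:3": 15, "user:4": 5}
loads = reducer_loads(key_sizes, num_reducers=2)
```

The assignment depends only on the key's hash: neither the 900-unit size of `user:1` nor the network location of the reducers enters the decision, which is the limitation the abstract points out.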




IEEE 2016: DeyPoS: Deduplicatable Dynamic Proof of Storage for Multi-User Environments

IEEE 2016 Transaction on Cloud Computing
Abstract: Dynamic Proof of Storage (PoS) is a useful cryptographic primitive that enables a user to check the integrity of outsourced files and to efficiently update the files in a cloud server. Although researchers have proposed many dynamic PoS schemes in single-user environments, the problem in multi-user environments has not been investigated sufficiently. A practical multi-user cloud storage system needs the secure client-side cross-user deduplication technique, which allows a user to skip the uploading process and obtain the ownership of the files immediately, when other owners of the same files have uploaded them to the cloud server. To the best of our knowledge, none of the existing dynamic PoSs can support this technique. In this paper, we introduce the concept of deduplicatable dynamic proof of storage and propose an efficient construction called DeyPoS, to achieve dynamic PoS and secure cross-user deduplication simultaneously. Considering the challenges of structure diversity and private tag generation, we exploit a novel tool called Homomorphic Authenticated Tree (HAT). We prove the security of our construction, and the theoretical analysis and experimental results show that our construction is efficient in practice.




IEEE 2016: Fine-Grained Two-Factor Access Control for Web-Based Cloud Computing Services

IEEE 2016 Transaction on Cloud Computing
Abstract: In this paper, we introduce a new fine-grained two-factor authentication (2FA) access control system for web-based cloud computing services. Specifically, in our proposed 2FA access control system, an attribute-based access control mechanism is implemented that requires both a user secret key and a lightweight security device. As a user cannot access the system without holding both, the mechanism can enhance the security of the system, especially in scenarios where many users share the same computer for web-based cloud services. In addition, attribute-based control in the system also enables the cloud server to restrict access to those users with the same set of attributes while preserving user privacy, i.e., the cloud server only knows that the user fulfills the required predicate, but does not learn the exact identity of the user. Finally, we also carry out a simulation to demonstrate the practicability of our proposed 2FA system.