COMP8210/COMP7210 Big Data Technologies
Background. Social data analytics has become a vital asset for organizations and governments. For example,
over the last few years, governments have started to extract knowledge and derive insights from rapidly growing
open/social data to personalize election advertising, improve government services, predict
intelligence activities, and improve national security and public health. A key challenge in analyzing
social data is to transform the raw data generated by social actors into curated data, i.e., contextualized data
and knowledge that is maintained and made available for use by end-users and applications.
In this assignment you will explore Big Data Technologies for analysing the data generated on social networks.
Reference. Beheshti et al., "DataSynapse: A Social Data Curation Foundry". Distributed and Parallel Databases
37(3): 351-384 (2019). Download: https://doi.org/10.1007/s10619-018-7245-1
Dataset. The Twitter dataset, containing 10k tweets, is available on iLearn.
Twitter serves many objects as JSON, including Tweets and Users. These objects all encapsulate core
attributes that describe the object. Each Tweet has an author, a message, a unique ID, a timestamp of when it
was posted, and sometimes geo metadata shared by the user. Each User has a Twitter name, an ID, a number
of followers, and most often an account bio.
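The core Tweet and User attributes described above can be read with a few lines of Python. This is a minimal sketch, not the assignment's required approach: the sample record is invented for illustration, and the field names (id_str, created_at, text, coordinates, user.screen_name, user.followers_count, user.description) assume the Twitter API v1.1 Tweet layout, which the dataset on iLearn may or may not follow exactly.

```python
import json
from datetime import datetime

# Hypothetical sample record. Field names assume the Twitter API v1.1
# Tweet layout; check them against the actual dataset before relying on them.
SAMPLE_TWEET = """
{
  "id_str": "1050118621198921728",
  "created_at": "Wed Oct 10 20:19:24 +0000 2018",
  "text": "Exploring big data tools for social analytics",
  "coordinates": null,
  "user": {
    "id_str": "6253282",
    "screen_name": "example_user",
    "followers_count": 1200,
    "description": "An example account bio"
  }
}
"""

# Timestamp format assumed for the v1.1 "created_at" field.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize_tweet(raw: str) -> dict:
    """Pull the core Tweet and User attributes out of one JSON record."""
    tweet = json.loads(raw)
    user = tweet.get("user") or {}
    created = tweet.get("created_at")
    return {
        "tweet_id": tweet.get("id_str"),
        "text": tweet.get("text"),
        "posted_at": datetime.strptime(created, CREATED_AT_FORMAT) if created else None,
        # Geo metadata is optional, so only record whether it is present.
        "has_geo": tweet.get("coordinates") is not None,
        "author": user.get("screen_name"),
        "author_id": user.get("id_str"),
        "followers": user.get("followers_count"),
        # The bio is usually present but can be missing or empty.
        "bio": user.get("description"),
    }


summary = summarize_tweet(SAMPLE_TWEET)
print(summary["author"], summary["followers"], summary["has_geo"])
```

Using .get() throughout keeps the function tolerant of records that omit optional fields such as geo metadata or the account bio.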
With each Tweet, Twitter generates 'entity' objects, which are arrays of common Tweet contents such as
hashtags, mentions, media, and links. If there are links, the JSON payload can also provide metadata such as
the fully unwound URL and the webpage’s title and description.
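As an illustration of the entity arrays described above, the following sketch collects hashtags, mentions, media, and links from a single Tweet. The sample record is invented, and the key names (entities.hashtags, entities.user_mentions, entities.urls, entities.media, and the unwound object on a link) assume the Twitter API v1.1 layout with enriched URL metadata; treat them as assumptions to verify against the dataset.

```python
import json

# Hypothetical sample record. Key names assume the Twitter API v1.1
# "entities" layout; the "unwound" object is only present when the payload
# carries enriched URL metadata.
SAMPLE_ENTITIES_TWEET = """
{
  "id_str": "1050118621198921729",
  "text": "#BigData reading list from @example_user https://t.co/abc123",
  "entities": {
    "hashtags": [{"text": "BigData", "indices": [0, 8]}],
    "user_mentions": [{"screen_name": "example_user", "id_str": "6253282"}],
    "urls": [
      {
        "url": "https://t.co/abc123",
        "expanded_url": "https://example.com/post",
        "unwound": {
          "url": "https://example.com/post",
          "title": "Example title",
          "description": "Example description"
        }
      }
    ]
  }
}
"""


def extract_entities(raw: str) -> dict:
    """Collect hashtags, mentions, media URLs, and link metadata from a Tweet."""
    tweet = json.loads(raw)
    entities = tweet.get("entities") or {}

    links = []
    for link in entities.get("urls", []):
        unwound = link.get("unwound") or {}
        links.append({
            # Prefer the fully unwound URL, then the expanded one, then the short one.
            "url": unwound.get("url") or link.get("expanded_url") or link.get("url"),
            "title": unwound.get("title"),
            "description": unwound.get("description"),
        })

    return {
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "mentions": [m["screen_name"] for m in entities.get("user_mentions", [])],
        # The media array is absent when a Tweet has no attached media.
        "media": [m.get("media_url_https") for m in entities.get("media", [])],
        "links": links,
    }


found = extract_entities(SAMPLE_ENTITIES_TWEET)
print(found["hashtags"], found["mentions"], found["links"])
```

Falling back from the unwound URL to the expanded and then the shortened URL means the function still returns a usable link when the title and description metadata are not provided.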