
7082 CEM Big Data Management and Data Visualisation

Code: 7082 CEM
Subject: Big Data Management and Data Visualisation

Task:
1. Select a dataset of your choice from one of the open dataset repositories (Kaggle, UCI, or others). Students are advised to inform the module leader by email of the dataset they have decided to work on and to get approval.
2. Use PySpark (or another Big Data program from the Hadoop ecosystem) to analyze the dataset. You should perform one or a combination of data analysis tasks (regression, clustering, classification, etc.) and explain your choice of the technique(s) used.
3. Use visualization to show the results of your analysis. You can use either Tableau or another program of your choice.
4. Critically analyze your findings: the results and the methods used.

Clarifications:
o You can use any operating system that you prefer to install your program.
o Coding the task you are performing yourself is a plus.
o Given the nature of this module and the task, you should document everything you do.
o Everything you do should be reproducible. The link to the dataset should be clear (a direct link to the dataset, not to the site where it is hosted). If you use code from an external source, the link should be clear and direct. If the code is not too long, it is better to include it in the report (in the appendix or as snippets in the report) or to submit it separately with your submission. If you modify code, the modification should be very clearly indicated (meaning you should show the original part that you modified and the modification you made).

This document is for Coventry University students for their own use in completing their assessed work for this module and should not be passed to third parties or posted on any website. Any infringements of this rule should be reported to

Report Structure:
Your report should typically have:
o A title.
o An introduction in which you briefly describe your project.
o An implementation part, in which you should introduce the program you are using (PySpark or another; the description should be more detailed if you use another program from the Hadoop ecosystem), how it is installed, how it is configured, how it works, the dataset you are applying your program to, and the data analysis task you are performing.
o A discussion of your findings.
o A conclusion.
o References.

Mark distribution:
Technical quality (45 Marks): This aspect concerns the depth of the information presented in the report.
Difficulty (15 Marks): This aspect concerns the difficulty of the program used or the analysis applied, the complexity of the dataset, applying several data analysis tasks, or programming the method by the student himself/herself.
Visualization (20 Marks): This aspect concerns the quality of the visualization produced.
Reproducibility (10 Marks): This aspect concerns using screenshots, providing the code used, and clear explanation of the steps taken.
Style and format (10 Marks)

Notes:
1. You are expected to use the Coventry University APA style for referencing. For support and advice on this, students can contact the Centre for Academic Writing (CAW).
2. Please notify your registry course support team and module leader for disability support.
3. Any student requiring an extension or deferral should follow the university process as outlined here.
4. The University cannot take responsibility for any coursework lost or corrupted on disks, laptops or personal computers. Students should therefore regularly back up any work and are advised to save it on the University system.
5. If there are technical or performance issues that prevent students submitting coursework through the online coursework submission system on the day of a coursework deadline, an appropriate extension to the coursework submission deadline will be agreed. This extension will normally be 24 hours, or the next working day if the deadline falls on a Friday or over the weekend period. This will be communicated via your Module Leader.
6. You are encouraged to check the originality of your work by using the draft Turnitin links on Aula.
7. Collusion between students (where sections of your work are similar to the work submitted by other students in this or previous module cohorts) is taken extremely seriously and will be reported to the academic conduct panel. This applies to both coursework and exam answers.
8. A marked difference between your writing style, knowledge and skill level demonstrated in class discussion or any test conditions and that demonstrated in a coursework assignment may result in you having to undertake a Viva Voce in order to prove the coursework assignment is entirely your own work.
9. If you make use of the services of a proofreader in your work, you must keep your original version and make it available as a demonstration of your written efforts.
10. You must not submit work for assessment that you have already submitted (partially or in full), either for your current course or for another qualification of this university, with the exception of resits, where for the coursework you may be asked to rework and improve a previous attempt. This requirement will be specifically detailed in your assignment brief or specific course or module information. Where earlier work by you is citable, i.e. it has already been published or submitted, you must reference it clearly. Identical pieces of work submitted concurrently may also be considered to be self-plagiarism.

Mark allocation
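
The clarifications in this brief ask that any modification to externally sourced code be clearly indicated, showing both the original part and the change. One possible way to produce such a record is a unified diff. The following is only a minimal sketch using Python's standard difflib module; the normalise function, both code snippets, and the file labels are hypothetical placeholders and are not part of the brief.

```python
import difflib

# Hypothetical snippet as copied from an external source (illustrative only).
original = """\
def normalise(values):
    total = sum(values)
    return [v / total for v in values]
"""

# Hypothetical modified version: adds a guard for an all-zero input.
modified = """\
def normalise(values):
    total = sum(values)
    if total == 0:
        return [0.0 for _ in values]
    return [v / total for v in values]
"""


def show_modification(before: str, after: str) -> str:
    """Return a unified diff: removed lines start with '-', added lines with '+'."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile="original (external source)",
        tofile="modified (this report)",
    )
    return "".join(diff)


report_diff = show_modification(original, modified)
print(report_diff)
```

The printed diff could be placed in the report appendix next to the direct link to the external source, so a reader can see exactly which lines were changed.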
