A Reference Architecture and Road map for Enabling E-commerce on Apache Spark

Mohit Sewak; Sachchidanand Singh

Research Article

Communications on Applied Electronics

Foundation of Computer Science (FCS), NY, USA

Volume 2 - Number 1

Year of Publication: 2015

Authors: Mohit Sewak, Sachchidanand Singh

10.5120/cae-1651

Mohit Sewak, Sachchidanand Singh. A Reference Architecture and Road map for Enabling E-commerce on Apache Spark. Communications on Applied Electronics. 2, 1 (June 2015), 37-42. DOI=10.5120/cae-1651

@article{10.5120/cae-1651,
  author = { Mohit Sewak and Sachchidanand Singh },
  title = { A Reference Architecture and Road map for Enabling E-commerce on Apache Spark },
  journal = { Communications on Applied Electronics },
  issue_date = { June 2015 },
  volume = { 2 },
  number = { 1 },
  month = { June },
  year = { 2015 },
  issn = { 2394-4714 },
  pages = { 37-42 },
  numpages = {9},
  url = { https://www.caeaccess.org/archives/volume2/number1/365-1651/ },
  doi = { 10.5120/cae-1651 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2023-09-04T19:40:32.980124+05:30
%A Mohit Sewak
%A Sachchidanand Singh
%T A Reference Architecture and Road map for Enabling E-commerce on Apache Spark
%J Communications on Applied Electronics
%@ 2394-4714
%V 2
%N 1
%P 37-42
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Apache Spark is an execution engine that, besides working as a standalone distributed in-memory computing engine, also integrates closely with Hadoop's distributed file system (HDFS). Its underlying appeal lies in providing a unified framework for building sophisticated applications that combine multiple workloads; it also handles unstructured data well and has easy-to-use APIs. Apache Spark offers a streaming component, Spark Streaming, which writes streamed data into the same in-memory data structures that the Spark SQL component, running on top of the core Spark framework, can read. Through its MLlib and SparkR subprojects, Apache Spark can also provide online machine learning, so besides streaming data it can execute machine-learning libraries, functions, or algorithms. This paper analyzes Apache Spark and highlights the role of Apache Spark (and its ecosystem) in the architecture of a modern E-commerce platform. It also proposes horizontally and vertically scalable reference architectures for both small and medium-sized enterprises (SMEs) and large E-commerce enterprises.
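
The unification of workloads described in the abstract can be illustrated without Spark itself. The following is a minimal sketch using only the Python standard library: an in-memory sqlite3 database stands in for Spark's shared in-memory data structures, small lists stand in for streaming micro-batches, and a running mean stands in for an online learning step. The table name orders, the helpers ingest_micro_batch, revenue_by_product, RunningMean and run_demo, and the sample events are all hypothetical; none of this is the Spark API or the architecture proposed in the paper.

```python
import sqlite3


def ingest_micro_batch(conn, batch):
    """Append one micro-batch of (user_id, product, amount) events
    to the shared in-memory table (stand-in for the streaming sink)."""
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", batch)


def revenue_by_product(conn):
    """Run a SQL query over everything ingested so far
    (stand-in for the SQL workload reading the same data)."""
    rows = conn.execute(
        "SELECT product, SUM(amount) FROM orders "
        "GROUP BY product ORDER BY product"
    ).fetchall()
    return dict(rows)


class RunningMean:
    """Incrementally updated mean of order amounts
    (stand-in for an online learning step)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean


def run_demo():
    # One in-memory store shared by ingestion, SQL, and the online model.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (user_id TEXT, product TEXT, amount REAL)"
    )
    model = RunningMean()

    # Hypothetical sample events, grouped into two micro-batches.
    batches = [
        [("u1", "book", 10.0), ("u2", "phone", 200.0)],
        [("u3", "book", 14.0)],
    ]

    snapshots = []
    for batch in batches:
        ingest_micro_batch(conn, batch)
        for _user, _product, amount in batch:
            model.update(amount)
        # The SQL view reflects each batch as soon as it lands.
        snapshots.append(revenue_by_product(conn))

    conn.close()
    return snapshots, model.mean


if __name__ == "__main__":
    snapshots, mean_amount = run_demo()
    for i, snap in enumerate(snapshots, start=1):
        print(f"after batch {i}: {snap}")
    print(f"running mean order amount: {mean_amount:.2f}")
```

The point of the sketch is only that ingestion, querying, and incremental modelling operate on one shared in-memory store, so a query issued after any batch sees that batch without a separate load step.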

Index Terms

Computer Science

Information Sciences

Keywords

Apache Spark, E-commerce, Spark Streaming, Spark SQL, Shark, MLlib, Mahout, SPork, SparkR, GraphX, In-Memory Computing, Distributed Architecture, Big Data, Streaming Engine, Parallel Computing.