A Reference Architecture and Road map for Enabling E-commerce on Apache Spark
Mohit Sewak and Sachchidanand Singh. Article: A Reference Architecture and Road map for Enabling E-commerce on Apache Spark. Communications on Applied Electronics 2(1):37-42, June 2015. Published by Foundation of Computer Science, New York, USA.
BibTeX
@article{key:article,
  author = {Mohit Sewak and Sachchidanand Singh},
  title = {Article: A Reference Architecture and Road map for Enabling E-commerce on Apache Spark},
  journal = {Communications on Applied Electronics},
  year = {2015},
  volume = {2},
  number = {1},
  pages = {37-42},
  month = {June},
  note = {Published by Foundation of Computer Science, New York, USA}
}
Abstract
Apache Spark is an execution engine that, besides working as a standalone distributed in-memory computing engine, also integrates closely with the Hadoop Distributed File System (HDFS). Its underlying appeal lies in providing a unified framework for building sophisticated applications that combine multiple workloads; it also handles unstructured data well and has easy-to-use APIs. Apache Spark also offers a streaming component, Spark Streaming, which writes streamed data into the same in-memory data structures that the Spark SQL component, running on top of the core Spark framework, can read. Apache Spark can provide online machine learning through its MLlib and SparkR subprojects, so that besides streaming data it can also execute machine-learning libraries, functions, and algorithms. This paper analyzes Apache Spark and highlights the role of Spark and its ecosystem in the architecture of a modern E-commerce platform. It also proposes horizontally and vertically scalable reference architectures for both small and medium enterprises (SMEs) and large E-commerce enterprises.
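The idea that streamed data lands in the same in-memory structures that a SQL component then reads can be illustrated with a small analogy. The sketch below does not use Spark or its APIs; it assumes Python's standard sqlite3 module with an in-memory database as a stand-in for the shared in-memory store. The table name `orders`, its columns, the helper functions, and the sample events are all hypothetical and are not taken from the paper.

```python
import sqlite3


def ingest_micro_batch(conn, batch):
    """Append one micro-batch of (user_id, product_id, amount) events
    to the shared in-memory table (hypothetical schema)."""
    conn.executemany(
        "INSERT INTO orders (user_id, product_id, amount) VALUES (?, ?, ?)",
        batch,
    )
    conn.commit()


def revenue_by_product(conn):
    """Run a SQL aggregation over everything ingested so far."""
    rows = conn.execute(
        "SELECT product_id, SUM(amount) FROM orders "
        "GROUP BY product_id ORDER BY product_id"
    ).fetchall()
    return dict(rows)


# In-memory database standing in for the shared in-memory data structures.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id TEXT, product_id TEXT, amount REAL)")

# A made-up stream, delivered as two micro-batches of order events.
stream = [
    [("u1", "book", 12.0), ("u2", "lamp", 30.0)],
    [("u3", "book", 8.0)],
]

# After each micro-batch arrives, the SQL side sees the updated data
# without any export or reload step.
snapshots = []
for batch in stream:
    ingest_micro_batch(conn, batch)
    snapshots.append(revenue_by_product(conn))
```

Each entry in `snapshots` is the query result after one more micro-batch, which is the property the abstract attributes to combining Spark Streaming with Spark SQL: the streaming writer and the SQL reader share one in-memory store.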
Keywords
Apache Spark, E-commerce, Spark Streaming, Spark SQL, Shark, MLlib, Mahout, SPork, SparkR, GraphX, In-Memory Computing, Distributed Architecture, Big Data, Streaming Engine, Parallel Computing.