Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2015, Volume 27, Issue 5, Pages 35–48
DOI: https://doi.org/10.15514/ISPRAS-2015-27(5)-3
(Mi tisp171)
 

This article is cited in 3 scientific papers.

Implementing Apache Spark jobs execution and Apache Spark cluster creation for OpenStack Sahara

A. Aleksiyants (a), O. Borisenko (a), D. Turdakov (a, b, c), A. Sher (a), S. Kuznetsov (a, d, b)

a Institute for System Programming of the RAS
b Lomonosov Moscow State University
c National Research University "Higher School of Economics" (HSE)
d Moscow Institute of Physics and Technology (State University)
Abstract: This paper discusses the problem of creating virtual clusters in clouds for big data analysis with Apache Hadoop and Apache Spark. Clouds and the MapReduce model are both popular today, the former for their low cost and the latter for efficient big data analysis. For these reasons, an open source solution for building such clusters is important. The article gives an overview of existing methods for creating Apache Spark clusters in clouds. We consider two open source cloud engines, OpenStack and Eucalyptus, and the most popular proprietary cloud service, Amazon Web Services, and examine the cloud-related features these systems provide. We then consider possible ways of creating virtual clusters for big data processing in OpenStack and describe their pros and cons. In the second part we describe in detail one of these solutions, which uses the Sahara service. Sahara is a cluster management system for OpenStack that is used to set up virtual clusters and execute MapReduce jobs. Prior to this work, Sahara did not support contemporary versions of Apache Spark. The article presents the results of our work, which led to a modification of Sahara, and describes its idea and implementation details. With our modification, Sahara can create and use virtual clusters with contemporary versions of Apache Spark in OpenStack clouds.
Keywords: Apache Spark, OpenStack, OpenStack Sahara, IaaS, PaaS.
Funding agency: Russian Foundation for Basic Research
Grant number: 14-07-00602
The work is supported by the RFBR, grant No. 14-07-00602
Document Type: Article
Language: English
Citation: A. Aleksiyants, O. Borisenko, D. Turdakov, A. Sher, S. Kuznetsov, “Implementing Apache Spark jobs execution and Apache Spark cluster creation for OpenStack Sahara”, Proceedings of ISP RAS, 27:5 (2015), 35–48
Citation in format AMSBIB
\Bibitem{AleBorTur15}
\by A.~Aleksiyants, O.~Borisenko, D.~Turdakov, A.~Sher, S.~Kuznetsov
\paper Implementing Apache Spark jobs execution and Apache Spark cluster creation for OpenStack Sahara
\jour Proceedings of ISP RAS
\yr 2015
\vol 27
\issue 5
\pages 35--48
\mathnet{http://mi.mathnet.ru/tisp171}
\crossref{https://doi.org/10.15514/ISPRAS-2015-27(5)-3}
\elib{https://elibrary.ru/item.asp?id=25141693}
Linking options:
  • https://www.mathnet.ru/eng/tisp171
  • https://www.mathnet.ru/eng/tisp/v27/i5/p35