Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2023, Volume 35, Issue 1, Pages 237–264
DOI: https://doi.org/10.15514/ISPRAS-2023-35(1)-15
(Mi tisp765)
 

Comparison of graph embeddings for source code with text models based on CNN and CodeBert architectures

V. A. Romanov, V. V. Ivanov

Innopolis University
Abstract: One way to reduce the number of bugs in source code is to build intelligent tools that simplify the development process. Such tools often rely on vector representations of source code and on machine learning techniques borrowed from natural language processing. However, these approaches do not take into account the specifics of source code and its structure. This work studies methods for pretraining graph vector representations of source code, where the graph represents the structure of the program. The results show that graph embeddings achieve an accuracy in classifying variable types in Python programs that is comparable to that of CodeBERT embeddings. Moreover, using text and graph embeddings together in a hybrid model can improve type classification accuracy by more than 10%.
Keywords: source code, variable type prediction, Python, graph neural networks, CodeBERT
Funding: Russian Science Foundation, grant number 22-21-00493
Document Type: Article
Language: Russian
Citation: V. A. Romanov, V. V. Ivanov, “Comparison of graph embeddings for source code with text models based on CNN and CodeBert architectures”, Proceedings of ISP RAS, 35:1 (2023), 237–264
Citation in AMSBIB format:
\Bibitem{RomIva23}
\by V.~A.~Romanov, V.~V.~Ivanov
\paper Comparison of graph embeddings for source code with text models based on CNN and CodeBert architectures
\jour Proceedings of ISP RAS
\yr 2023
\vol 35
\issue 1
\pages 237--264
\mathnet{http://mi.mathnet.ru/tisp765}
\crossref{https://doi.org/10.15514/ISPRAS-2023-35(1)-15}
Linking options:
  • https://www.mathnet.ru/eng/tisp765
  • https://www.mathnet.ru/eng/tisp/v35/i1/p237