ml4se
Machine Learning for Software Engineering
Ransomware Detection (Huawei)

* A baseline model built from historical data is used to detect abnormal changes in the metadata feature values of copies.
* Abnormal copies are then compared further to assess file size changes, entropy values, and similarity.
* A machine learning (ML) model determines whether the file changes were caused by ransomware encryption and flags the copies accordingly.
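
The entropy comparison in the second step can be illustrated with a minimal sketch. This is not Huawei's implementation: the function names `shannon_entropy` and `looks_encrypted` and the thresholds `min_entropy` and `min_jump` are assumptions chosen for illustration, and a simple threshold rule stands in for the ML model. The idea is that encrypted data looks close to random, so its byte entropy approaches 8 bits per byte.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum over observed byte values of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def looks_encrypted(before: bytes, after: bytes,
                    min_entropy: float = 7.5, min_jump: float = 1.0) -> bool:
    """Hypothetical threshold rule (an assumption, not the ML model from the post).

    Flags a change when the new copy has near-random entropy and the
    entropy rose sharply compared with the previous copy.
    """
    e_before = shannon_entropy(before)
    e_after = shannon_entropy(after)
    return e_after >= min_entropy and (e_after - e_before) >= min_jump


if __name__ == "__main__":
    plain = b"hello world" * 100           # repetitive text, low entropy
    random_like = bytes(range(256)) * 4    # uniform byte distribution, 8 bits per byte
    print(round(shannon_entropy(plain), 2))
    print(shannon_entropy(random_like))
    print(looks_encrypted(plain, random_like))
```

Already-compressed files (archives, images, video) also have high entropy, which is presumably why the post combines entropy with file size changes, similarity, and a trained model rather than relying on a single threshold.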
Python 2 removed from Debian
Microsoft is preparing to add OpenAI’s ChatGPT chatbot to its Bing search engine

OpenAI, the AI research shop backed by a $1 billion investment from Microsoft, publicly released ChatGPT for users to test in November. The chatbot’s ability to spout everything from cocktail recipes to authentic-seeming school essays has since catapulted it into the spotlight. While the AI service sometimes confidently offers incorrect information with a patina of authority, some analysts and experts have suggested its ability to summarize publicly available data can make it a credible alternative to Google search and a list of search-generated links.
The Art of LaTeX

Some common mistakes that are made by LaTeX practitioners (even in heavily cited papers)
On the Security Vulnerabilities of Text-to-SQL Models

The authors show that the Text-to-SQL modules of two commercial black-box systems (Baidu-UNIT and Codex-powered Ai2sql) can be manipulated to produce malicious code, potentially leading to data breaches and denial of service. This demonstrates the danger of NLP models being exploited as attack vectors in the wild. Moreover, experiments with four open-source frameworks verified that simple backdoor attacks can achieve a 100% success rate on Text-to-SQL systems with almost no impact on prediction performance.
LineVul: A Transformer-based Line-Level Vulnerability Prediction

The authors propose LineVul, a Transformer-based approach to detecting vulnerabilities in source code. It predicts vulnerabilities at line level.

Code
An Analysis of the Automatic Bug Fixing Performance of ChatGPT

The paper evaluates ChatGPT on the standard bug-fixing benchmark set, QuixBugs. ChatGPT’s bug-fixing performance is competitive with the common deep learning approaches CoCoNut and Codex and notably better than the results reported for the standard program repair approaches. In contrast to previous approaches, ChatGPT offers a dialogue system through which further information, e.g., the expected output for a certain input or an observed error message, can be entered. Providing such hints to ChatGPT further increases its success rate: it fixes 31 out of 40 bugs, outperforming the state of the art.
Which Features are Learned by CodeBert: An Empirical Study of the BERT-based Source Code Representation Learning

Recently, researchers applied BERT to source-code representation learning and reported good results on several downstream tasks. However, in this paper, the authors show that current methods cannot effectively understand the logic of source code.