ml4se
Machine Learning for Software Engineering
Scientists and government representatives meeting at a conference in France have voted to scrap leap seconds by 2035, the organisation responsible for global timekeeping has said.

In November 2022, the 27th General Conference on Weights and Measures, held about every four years at the Versailles Palace, decided to abandon the leap second by or before 2035. From then on, the difference between atomic and astronomical time will be allowed to grow to a larger value that is yet to be determined.
CS598: Machine Learning for Software Engineering

- Code representation and embeddings
- Source code analysis
- Code summarization
- Test input generation
- Fuzz testing
- Oracle inference
- Fault localization
- Program (bug) repair
- Regression testing
- Security testing and vulnerability detection
- Code completion
- Clone detection
Course: Machine Learning for Software Engineering (Ural State University)

- Introduction to machine learning
- Introduction to Transformer
- Code representation 1
- Code representation 2
- Code generation
- Code summarization
- Clone detection
- Code search 1
- Code search 2
- Code completion
- Vulnerabilities
Large Language Models Can Self-Improve

CoT + multiple path decoding + self-consistency = effective self-training

74.4%->82.1% on GSM8K
78.2%->83.0% on DROP
90.0%->94.4% on OpenBookQA
63.4%->67.9% on ANLI-A3
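
The self-consistency step in the recipe above can be sketched roughly as follows: sample several reasoning paths per question, take the majority final answer, and keep only questions where the vote is confident enough to serve as a self-training label. This is a minimal sketch, not the paper's code; the function names and the `min_confidence` threshold are hypothetical, and the sampled answers are assumed to be already extracted from the chain-of-thought outputs.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Majority vote over final answers from several sampled reasoning paths.

    Returns (answer, confidence), where confidence is the share of paths
    that agree with the winning answer.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)


def select_self_training_examples(question_to_answers, min_confidence=0.6):
    """Keep questions whose majority answer is confident enough.

    min_confidence is an illustrative assumption, not a value from the paper.
    """
    selected = {}
    for question, answers in question_to_answers.items():
        answer, confidence = self_consistent_answer(answers)
        if confidence >= min_confidence:
            selected[question] = answer
    return selected


if __name__ == "__main__":
    sampled = {
        "q1": ["18", "18", "20", "18"],  # 3 of 4 paths agree
        "q2": ["1", "2", "3"],           # no agreement
    }
    print(select_self_training_examples(sampled))
```

In this toy run only `q1` would be kept as a pseudo-labeled example; `q2` is dropped because no answer reaches the threshold.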
Is effective self-training possible for small and medium-sized models?
Anonymous Poll
57%
Yes
43%
No
CodeQL code scanning launches Kotlin analysis support

Starting November 28, GitHub code scanning includes beta support for analyzing code written in Kotlin, powered by the CodeQL engine.
Advent of Code is an annual set of Christmas-themed computer programming challenges that follow an Advent calendar. It has been running since 2015. The programming puzzles cover a variety of skill sets and skill levels and can be solved using any programming language.

OpenAI Solved Part 1 in 10 Seconds
https://www.reddit.com/r/adventofcode/comments/zb942v/2022_day_03_first_place_for_part_1_today_10/
Ransomware Detection (Huawei)

* A baseline model is built from historical data to detect abnormal changes in the metadata feature values of copies.
* Abnormal copies are then compared further to determine file size changes, entropy values, and similarities.
* A machine learning (ML) model determines whether the file changes are caused by ransomware encryption and flags them accordingly.
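
The entropy comparison mentioned in the steps above can be illustrated with Shannon entropy over file bytes: encrypted data tends to look close to uniformly random, so its entropy approaches 8 bits per byte. This is only a sketch under that assumption, not Huawei's implementation; `entropy_jump` and its `threshold` value are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_jump(before: bytes, after: bytes, threshold: float = 2.0) -> bool:
    """Return True if entropy rose by at least `threshold` bits per byte.

    The threshold is an illustrative assumption; a real system would
    combine this signal with file size changes and similarity measures.
    """
    return shannon_entropy(after) - shannon_entropy(before) >= threshold


if __name__ == "__main__":
    plain = b"hello hello hello hello " * 100
    random_like = bytes(range(256)) * 10  # stand-in for encrypted content
    print(shannon_entropy(plain), shannon_entropy(random_like))
    print(entropy_jump(plain, random_like))
```

A high entropy alone is not conclusive, since compressed archives and media files are also high-entropy, which is presumably why the described pipeline passes several features to an ML model.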
Python 2 removed from Debian
Microsoft is preparing to add OpenAI’s ChatGPT chatbot to its Bing search engine

OpenAI, the AI research shop backed by a $1 billion investment from Microsoft, publicly released ChatGPT for users to test in November. The chatbot’s ability to spout everything from cocktail recipes to authentic-seeming school essays has since catapulted it into the spotlight. While the AI service sometimes confidently offers incorrect information with a patina of authority, some analysts and experts have suggested its ability to summarize publicly available data can make it a credible alternative to Google search and a list of search-generated links.
The Art of LaTeX

Some common mistakes that are made by LaTeX practitioners (even in heavily cited papers)
On the Security Vulnerabilities of Text-to-SQL Models

The authors show that the Text-to-SQL modules of two commercial black-box systems (Baidu-UNIT and the Codex-powered Ai2sql) can be manipulated to produce malicious code, potentially leading to data breaches and denial of service. This demonstrates the danger of NLP models being exploited as attack vectors in the wild. Moreover, experiments on four open-source frameworks verify that simple backdoor attacks can achieve a 100% success rate on Text-to-SQL systems with almost no impact on prediction performance.