ml4se
Machine Learning for Software Engineering
Google announces Bard, an experimental conversational AI service, powered by LaMDA. "Today, we’re taking another step forward by opening it up to trusted testers".
SantaCoder: Don't reach for the stars!

The BigCode project is an open scientific collaboration working on the responsible development of large language models for code. The authors train 1.1B-parameter models on the Java, JavaScript, and Python subsets of The Stack and evaluate them on the MultiPL-E text-to-code benchmark. They find that more aggressive filtering of near-duplicates can further boost performance and, surprisingly, that selecting files from repositories with 5+ GitHub stars deteriorates performance significantly. The best model outperforms previous open-source multilingual code generation models (InCoder-6.7B and CodeGen-Multi-2.7B) in both left-to-right generation and infilling on the Java, JavaScript, and Python portions of MultiPL-E, despite being substantially smaller. All models are released under an OpenRAIL license at https://hf.co/bigcode.
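The near-duplicate filtering mentioned above is typically done at scale with MinHash over shingles; as an illustrative sketch only (not the paper's actual pipeline), here is the same idea with exact Jaccard similarity over character shingles:

```python
# Illustrative sketch of near-duplicate filtering on a code corpus.
# Real pipelines use MinHash/LSH for scalability; this computes exact
# Jaccard similarity over character shingles instead. The threshold
# value is an assumption, not taken from the paper.

def shingles(text, k=5):
    """Set of k-character shingles of a string."""
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dedup(files, threshold=0.85):
    """Keep each file unless it is a near-duplicate of one already kept."""
    kept, kept_shingles = [], []
    for f in files:
        s = shingles(f)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(f)
            kept_shingles.append(s)
    return kept
```

Two files that differ only in trailing whitespace share almost all shingles and are filtered; genuinely different functions fall well below the threshold.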
Packing Unit Squares in Squares: A Survey and New Results

Let s(n) be the side of the smallest square into which we can pack n unit squares. The paper presents a history of this problem, and gives the best known upper and lower bounds for s(n) for n ≤ 100, including the best known packings.

Best known packings: https://erich-friedman.github.io/packing/squinsqu/
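Elementary bounds on s(n) follow from two standard observations (these are classical facts, not results of the survey): the n unit squares cover area n, so s(n) ≥ √n, and an axis-aligned ⌈√n⌉ × ⌈√n⌉ grid always fits n unit squares, so s(n) ≤ ⌈√n⌉. A minimal sketch:

```python
import math

# Elementary bounds on s(n), the side of the smallest square that can
# hold n unit squares:
#   - area argument:  total area n forces s(n) >= sqrt(n)
#   - grid packing:   a ceil(sqrt(n)) x ceil(sqrt(n)) grid always works
# These are the trivial bounds; the survey's improved bounds for n <= 100
# come from much more intricate (often tilted) packings.

def s_bounds(n):
    lower = math.sqrt(n)              # area lower bound
    upper = math.ceil(math.sqrt(n))   # grid-packing upper bound
    return lower, upper
```

The bounds coincide exactly when n is a perfect square (s(n²) = n); the hard cases are in between, e.g. for n = 5 the trivial bounds give only 2.236… ≤ s(5) ≤ 3.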
Transformer models: an introduction and catalog

Comprehensive and simple catalog and classification of the most popular Transformer models.

Table: https://docs.google.com/spreadsheets/d/1ltyrAB6BL29cOv2fSpNQnnq2vbX8UrHl47d7FkIf6t4/
Amazon’s Cloud Unit Partners With Startup Hugging Face as AI Deals Heat Up

Amazon.com Inc.’s cloud unit is expanding a partnership with artificial intelligence startup Hugging Face Inc., which is developing a ChatGPT rival, the latest move as the biggest technology firms line up allies in an attention-getting market for generative AI systems.
LLaMA is a collection of large language models ranging from 7B to 65B parameters [The FAIR team of Meta AI].

In order to download the checkpoints and tokenizer, fill out the Google form.
ITISE 2023: International Conference on Time Series and Forecasting

July 12th-14th, 2023. Gran Canaria (SPAIN)
Call for papers: deadline extended to 2023.03.20
Topics: https://itise.ugr.es/topics.html
Debunking a myth: plant consciousness

Although it is widely believed that consciousness is an emergent property arising from complex networks of neurons, there is a hypothesis that plants have consciousness. In this paper, the authors present new arguments against plant consciousness, as well as new perspectives on past arguments.
FinerBench4BL: Large-Scale Evaluation of Method-Level Bug Localization with FinerBench4BL

FinerBench4BL is an evaluation framework for method-level information-retrieval-based bug localization techniques; the paper also presents a comparative study conducted with it.
The 16th Annual AGI Conference

The AGI conference series has been organized by the Artificial General Intelligence Society since 2008. The 16th annual AGI conference (AGI-23) will be held as a hybrid virtual/in-person event in Stockholm, June 16-19, 2023.

Final deadline for submitted papers: March 12, 2023

Appropriate topics for contributed papers include, but are not restricted to:
AGI Architectures
Autonomy and Creativity
Benchmarks and Evaluation
Cognitive Modeling
Multi-Agent Interaction and Collaborative Intelligence
Theoretical Foundation of General Intelligence
Broader Implications of AGI
Knowledge Representation
Reinforcement and Learning Theory
Motivation, Emotion and Affect
Natural Language Understanding
Neurosymbolic AI
Perception and Perceptual Modeling
Reasoning, Inference and Planning
Robotics and Virtual Agents
Simulation and Evolutionary Computation
ChatML

OpenAI released the ChatGPT API together with the Chat Markup Language (ChatML). The basic idea behind ChatML is to ensure that model inputs are sent in a structured format rather than as unstructured text.

https://github.com/openai/openai-python/blob/main/chatml.md
Cops: An Improved Information Retrieval-Based Bug Localization Technique Using Context-Aware Program Simplification

The authors propose a context-aware program simplification technique that enables statement-level bug localization for Python-based projects. They evaluate COPS on the PyTraceBugs benchmark and compare it to state-of-the-art techniques using four widely used metrics.
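The generic IR step behind such techniques ranks program elements by textual similarity to the bug report. As an illustrative sketch only (not the COPS algorithm, which adds context-aware simplification on top), TF-IDF cosine ranking over whitespace tokens:

```python
import math
from collections import Counter

# Illustrative sketch of plain IR-based bug localization: rank candidate
# source files by TF-IDF cosine similarity to a bug report. Real tools
# add much more (stack-trace analysis, program simplification, etc.).

def tfidf_vectors(docs):
    """TF-IDF vector (as a dict) for each document in the collection."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_files(bug_report, files):
    """Indices of files, most similar to the bug report first."""
    vecs = tfidf_vectors([bug_report] + files)
    query, docs = vecs[0], vecs[1:]
    return sorted(range(len(files)),
                  key=lambda i: cosine(query, docs[i]), reverse=True)
```

A file sharing vocabulary with the report ("parser", "token", "stream") ranks above an unrelated one; statement-level techniques refine this ranking below file granularity.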