Data Sets

Data Sets Deep Learning Explainable AI Featured Posts Generative AI Python Synthetic Data

A New Type of Non-Standard High Performance DNN with Remarkable Stability

I explore deep neural networks (DNNs) starting from the foundations, introducing a new type of architecture that is as different from machine learning as it is from traditional AI. The original adaptive loss function, introduced here for the first time, leads to spectacular performance improvements via a mechanism called equalization. To accurately approximate any response, rather […]

Books Courses Data Sets Explainable AI Featured Posts Generative AI Natural Language Processing Python Synthetic Data

10 Must-Read Articles and Books About Next-Gen AI in 2025

You could call it the best-kept secret for professionals and experts in AI, as you won’t find these books and articles in traditional outlets. Yet they are read by far more people than documents posted on arXiv or published in scientific journals, so it is not really a secret. Actually, one of these books is also […]

Data Sets Experimental Math Featured Posts Python Synthetic Data

Universal Dataset to Test, Enhance and Benchmark AI Algorithms

This scientific research has three components. First, my most recent advances towards solving one of the most famous, centuries-old conjectures in number theory: one that kids in elementary school can understand, yet that is incredibly hard to prove. At its very core, it is about the spectacular quantum dynamics of the digit sum function. Then, I […]

Data Sets Experimental Math Featured Posts Machine Learning Natural Language Processing Python Stochastic Systems

LLM Challenge with Petabytes of Data to Prove Famous Number Theory Conjecture

In my recent article “Piercing the Deepest Mathematical Mystery” posted here, I paved the way to proving a famous centuries-old conjecture: are the digits of major mathematical constants such as π, e, log 2, or √2 evenly distributed? No one has ever managed to prove even the most basic trivialities, such as whether the […]

Data Sets Explainable AI Featured Posts Python Statistical Science Stochastic Systems Visualization

Visualizing Trading Strategies that Consistently Outperform the Stock Market

In this short paper, I discuss two topics. First, strategies to trade the S&P 500 index with few trades over long time periods, identifying the best exit, entry and re-entry points along the way, to beat the baseline return. The baseline consists of staying long the whole time. The dataset has 40 years’ worth of daily […]

Data Sets Explainable AI Featured Posts Machine Learning Natural Language Processing Python

Hyperfast Contextual Custom LLM with Agents, Multitokens, Explainable AI, and Distillation

For the most recent article discussing the xLLM web API, follow this link. I discuss version 2.0 of my enterprise multi-LLM system called xLLM. Version 1.0 was presented in my recent article entitled “Custom Enterprise LLM/RAG with Real-Time Fine-Tuning”, posted here. Since version 2.0 is backward-compatible and consists of several important additions, I included all […]

Data Sets Explainable AI Featured Posts Generative AI Natural Language Processing Python

Custom Enterprise LLM/RAG with Real-Time Fine-Tuning

This article features an application of xLLM to extract information from a corporate corpus, using prompts referred to as “queries”. The goal is to serve the business user — typically an employee of the company or someone allowed access — with condensed, relevant pieces of information including links, examples, PDFs, tables, charts, definitions and so […]

Data Sets Deep Learning Featured Posts Synthetic Data Time Series

Synthesizing Multi-Table Databases: Model Evaluation & Vendor Comparison

Synthesizing multi-table tabular data presents its own challenges compared to the single-table case. When the database contains date columns such as transaction or admission dates, a frequent occurrence in real-world datasets, generating high-quality synthesizations and evaluating models are even more complicated. In this article, we focus on this type of problem, comparing generated observations produced by […]

Data Sets Explainable AI Featured Posts Generative AI Natural Language Processing Python

Breakthrough: Zero-Weight LLM for Accurate Predictions and High-Performance Clustering

While most AI companies keep building LLMs with more weights and tokens (now one trillion is a standard number), I went in the opposite direction. Of course, zero weight means that there is no neural network behind the scenes. More specifically, it means that there is no lengthy black-box process to find the “best” weights […]

Data Sets Explainable AI Featured Posts Generative AI Natural Language Processing

Genome: Synthesizing DNA Sequences with LLM Techniques

This methodology is not focused on genome data alone. The purpose is to design a generic solution that may also work in other contexts, such as synthesizing molecules. The problem involves dealing with a large amount of “text”. Indeed, the sequences discussed here consist of letter arrangements, from an alphabet that has 5 symbols: A, […]
