Saturday, March 1, 2025

GenAI and LLMs development, trends and implications (17. - 23.2.2025)

Releases:
OpenAI cancels the standalone o3 release and announces a roadmap for GPT-4.5 and GPT-5
OpenAI releases Operator, an AI agent for web-based tasks
OmniHuman-1 released - AI-generated human animation
Latin America launches Latam-GPT to improve AI cultural relevance

Vision and video generation:

In software development:
Prompt engineering: Is it a new programming language?
Zero human code: What I learned from forcing AI to build (and fix) its own code for 27 straight days

And software testing:
Meta introduces LLM-powered tool for software testing
TDD and generative AI – a perfect pairing?
Generate unit tests with AI using Ollama and Spring Boot
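
To make the Ollama item above concrete, here is a minimal, framework-free Java sketch that asks a locally running Ollama instance to draft a JUnit test. It assumes Ollama is listening on its default port 11434 with a model named llama3 pulled, and the add method is just an illustrative target; a Spring Boot setup would typically wrap this call behind Spring AI rather than raw HTTP.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch: ask a local Ollama model to draft a JUnit test.
// Assumes Ollama runs on its default port and a model named "llama3" is available.
public class UnitTestGenerator {

    public static void main(String[] args) throws Exception {
        String prompt = "Write a JUnit 5 test class for this Java method:\n"
                + "public int add(int a, int b) { return a + b; }\n"
                + "Return only the Java code.";

        // Ollama's /api/generate endpoint takes a model name and a prompt.
        String body = """
                {"model": "llama3", "prompt": %s, "stream": false}
                """.formatted(toJsonString(prompt));

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The generated test source arrives in the "response" field of the JSON payload.
        System.out.println(response.body());
    }

    // Naive JSON string escaping, enough for this sketch.
    private static String toJsonString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\"";
    }
}

The generated test still needs a human review before it lands in the test suite.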

Building apps:
Emerging patterns in building GenAI products - Guardrails (a minimal guardrail sketch follows after this list)
Build scalable GenAI applications in the cloud: From data preparation to deployment
Building intelligent microservices with Go and AWS AI services
Spring AI with Anthropic’s Claude models example
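
Picking up the Guardrails item from the list above: an output-side guardrail can be as simple as screening the model's answer before it reaches the user. A minimal Java sketch, where the PII pattern, deny-list and fallback messages are illustrative assumptions rather than a real policy:

import java.util.List;
import java.util.regex.Pattern;

// Minimal output-side guardrail sketch: screen a model response before it reaches the user.
// The patterns and fallback messages are illustrative assumptions, not a product rule set.
public class OutputGuardrail {

    // Example checks: a crude e-mail/PII pattern and a small deny-list of terms.
    private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");
    private static final List<String> DENY_LIST = List.of("internal-only", "confidential");

    public String apply(String modelResponse) {
        if (EMAIL.matcher(modelResponse).find()) {
            return "The response was withheld because it appears to contain personal data.";
        }
        for (String term : DENY_LIST) {
            if (modelResponse.toLowerCase().contains(term)) {
                return "The response was withheld because it references restricted content.";
            }
        }
        return modelResponse; // passed all checks, safe to return
    }

    public static void main(String[] args) {
        OutputGuardrail guardrail = new OutputGuardrail();
        System.out.println(guardrail.apply("Contact jane.doe@example.com for the internal-only report."));
        System.out.println(guardrail.apply("Here is a public summary of the quarterly results."));
    }
}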

LLMs:
How LLMs work: Pre-training to post-training, neural networks, hallucinations, and inference
Dive into tokenization, attention, and key-value caching (the attention formula is recalled after this list)
The Delegated Chain of Thought architecture
A comprehensive guide to Generative AI training
Semantic clustering of user messages with LLM prompts - tutorial
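
For readers new to the attention item above, the standard scaled dot-product attention used in transformer LLMs is

Attention(Q, K, V) = softmax(Q K^T / \sqrt{d_k}) V

where Q, K and V are the query, key and value matrices and d_k is the key dimension. Key-value caching simply stores the K and V rows of tokens already processed, so they are not recomputed for every newly generated token.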

LLMs and search:
Have LLMs solved the search problem?
Search: From basic document retrieval to answer generation

RAG:
Building a simple RAG application with Java and Quarkus
Creating an agentic RAG for Text-to-SQL applications
Multimodal RAG with ColPali, Milvus, and VLMs
Retrieval Augmented Generation in SQLite
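
To make the retrieval step behind these RAG items concrete, here is a dependency-free Java sketch that ranks pre-embedded document chunks by cosine similarity against a query embedding and stuffs the best match into the prompt. The tiny vectors are placeholders; a real application would obtain them from an embedding model and usually a vector store such as Milvus or SQLite.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Minimal RAG retrieval sketch: rank pre-embedded chunks by cosine similarity to the query.
// The embeddings below are tiny placeholders; a real app would call an embedding model.
public class SimpleRetriever {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, double[]> chunks = Map.of(
                "Invoices are approved by the finance team.", new double[]{0.9, 0.1, 0.0},
                "The office is closed on public holidays.", new double[]{0.1, 0.8, 0.2});
        double[] queryEmbedding = {0.85, 0.15, 0.05}; // embedding of "Who approves invoices?"

        // Pick the best-matching chunk and use it as grounding context in the prompt.
        String context = chunks.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(queryEmbedding, e.getValue())))
                .limit(1)
                .map(Map.Entry::getKey)
                .collect(Collectors.joining("\n"));

        String prompt = "Answer using only this context:\n" + context
                + "\n\nQuestion: Who approves invoices?";
        System.out.println(prompt);
    }
}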

Agents:
AI agents from zero to hero – part 1
Agentic workflows for unlocking user engagement insights
Azure AI Agent Service now in public preview for developers in AI Foundry SDK and Portal
Observability and DevTool platforms for AI agents
AI Agents: Future of automation or overhyped buzzword?

Future:
40% of AI data breaches will arise from cross-border GenAI misuse by 2027

Saturday, February 1, 2025

GenAI and LLMs development, trends and implications (20. - 26.1.2025)

What’s the real ROI of AI in 2025?

Google releases experimental AI reasoning model - Gemini 2.0 Flash Thinking Experimental
DeepSeek open-sources DeepSeek-V3, a 671B-parameter mixture-of-experts LLM
Nvidia Ingest aims to make it easier to extract structured information from documents
Microsoft Research unveils rStar-Math, advancing mathematical reasoning in Small Language Models
Microsoft Phi-4 is a Small Language Model specialized for complex math reasoning
Amazon Bedrock introduces Multi-Agent Systems (MAS) with open-source framework integration
Luma AI’s Ray2 video model is now available in Amazon Bedrock

Want to integrate AI into your business? Fine-tuning won’t cut it
Building successful AI Apps: The dos and don’ts
Agentic Mesh: Towards enterprise-grade agents

Advancing AI reasoning: Meta-CoT and system 2 thinking

Choose a database with a hybrid vector search for AI apps

A framework for building micro metrics for LLM system evaluation
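
As one illustration of what such a micro metric can look like, the sketch below computes a per-case keyword hit rate over a handful of eval cases; the cases and expected keywords are made up for the example.

import java.util.List;

// Tiny example of a "micro metric": per-case keyword hit rate over a small eval set.
// The test cases and expected keywords are illustrative, not a real benchmark.
public class KeywordHitRate {

    record EvalCase(String modelAnswer, List<String> expectedKeywords) {}

    static double hitRate(EvalCase c) {
        long hits = c.expectedKeywords().stream()
                .filter(k -> c.modelAnswer().toLowerCase().contains(k.toLowerCase()))
                .count();
        return (double) hits / c.expectedKeywords().size();
    }

    public static void main(String[] args) {
        List<EvalCase> cases = List.of(
                new EvalCase("Invoices are approved by the finance team within 5 days.",
                        List.of("finance", "5 days")),
                new EvalCase("Refunds take about a week.", List.of("refund", "7 days")));

        cases.forEach(c -> System.out.printf("hit rate: %.2f%n", hitRate(c)));
    }
}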

Why LLMs suck at ASCII art
Large Language Models: A short introduction

Human minds vs. machine learning models - exploring the parallels and differences between psychology and machine learning
Understanding emergent capabilities in LLMs - lessons from biological systems

Chain-of-Thought Prompting - a comprehensive analysis of reasoning techniques in Large Language Models

RAG isn’t immune to LLM hallucination

Designing, building & deploying an AI chat app from scratch - part 1 and part 2
A guide to deploying AI for real-time content moderation
Real-time data streaming with AI

How LLMs are going to change code generation in modern IDEs
Meet Junie, your coding agent by JetBrains
"Fix with AI" button to automate Playwright test fixes
Collaborative Intelligence - maximizing human-AI partnerships in the workplace

Building effective agents with Spring AI (Part 1)
Fresh data for AI with Spring AI function calls (a framework-free sketch of the pattern follows after this list)
Powering LLMs with Apache Camel and LangChain4j
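
The function-calling item above boils down to a simple loop: the model replies with the name of a tool plus JSON arguments, the application runs the matching function and sends the result back to the model. A framework-free Java sketch of the dispatch step, with a made-up getWeather tool (Spring AI and LangChain4j wrap this plumbing for you):

import java.util.Map;
import java.util.function.Function;

// Sketch of the dispatch half of LLM function calling: the model names a tool and passes
// arguments; the application looks the tool up and runs it. The tools and the parsed
// arguments are made up; frameworks like Spring AI or LangChain4j handle the JSON
// plumbing and the follow-up call to the model.
public class FunctionCallDispatcher {

    private final Map<String, Function<Map<String, String>, String>> tools = Map.of(
            "getWeather", args -> "Sunny, 21 °C in " + args.get("city"),
            "getTime", args -> "It is 10:15 in " + args.get("city"));

    public String dispatch(String toolName, Map<String, String> arguments) {
        Function<Map<String, String>, String> tool = tools.get(toolName);
        if (tool == null) {
            return "Unknown tool: " + toolName;
        }
        // The returned string would be sent back to the model as the tool result.
        return tool.apply(arguments);
    }

    public static void main(String[] args) {
        // Pretend the model answered: {"tool": "getWeather", "arguments": {"city": "Prague"}}
        FunctionCallDispatcher dispatcher = new FunctionCallDispatcher();
        System.out.println(dispatcher.dispatch("getWeather", Map.of("city", "Prague")));
    }
}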

Saturday, January 25, 2025

GenAI and LLMs development, trends and implications (13. - 19.1.2025)

Prompt engineering has become an essential skill for working effectively with large language models (LLMs) - a guide to the best prompt engineering books
Google unveiled PaliGemma 2 - a family of vision-language models (VLMs)
NVIDIA announces DIGITS - its first personal AI computer

Projects like AYA Expanse are exploring multilingual capabilities

Combining local and cloud models to build a multimodal AI assistant answering complex image questions, with the option to run everything locally

Importance of robust system memory as a key to personalized AI intelligence
Building reliable AI applications - LLM routing

Microsoft's framework for AI-driven cloud operations - AIOpsLab
Introducing Google's Vertex AI RAG engine
Enterprise RAG in Amazon Bedrock - learn the details of the Amazon Bedrock Knowledge Bases capability

Real-world applications and best practices using Azure AI and GPT-4
Developing an AI-powered smart guide for business planning & entrepreneurship

Supercharging RAG with MAS (Multi-Agent System)

The rise of reasoner models - scaling test-time compute
Advancing complex medical reasoning with HuatuoGPT-o1

Major LLMs have the capability to pursue hidden goals

And the future:

Sunday, October 20, 2024

GenAI and LLMs development, trends and implications (7. - 13.10.2024)

Adoption:
LLMs generally:
New models and functionality:
AI agents:
AI-enhanced software development:
Enhancing applications with GenAI:

Saturday, October 19, 2024

IT links (7. - 13.10.2024)


Java Streams:

Sunday, October 13, 2024

GenAI and LLMs development, trends and implications (30.9. - 6.10.2024)

Adoption:
LLMs generally:
AI agents:
RAG:
AI-enhanced software development:
Enhancing applications with GenAI:

Saturday, April 13, 2024

Exploring Advanced AI Techniques: Ghost Attention, Thought Structures, Prompt Engineering and more

Diving deeper into the realm of generative AI, I've come across several articles that I find interesting as a beginner in this field:

The article - Understanding Ghost Attention in LLaMa 2 - takes a deep look at the ghost attention technique used in LLaMa 2.

One example of providing instructions for a specific chat is the Prompt Instructions feature in IBM's watsonx service:

You define the instructions in the upper input field and then start chatting below.

For a more detailed look at how generative AI reasons, this article explores the differences between "Chain of thoughts" and "Tree of thoughts" - Chain of Thoughts vs Tree of Thoughts for Language Learning Models (LLMs)

How can you work better with these systems? You can improve the output using prompt patterns or n-shot prompting - 7 Prompt Patterns You Should Know
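
As a small illustration of n-shot prompting, a few labelled examples placed directly in the prompt steer the model towards the desired output format (the reviews below are made up):

Classify the sentiment of each review as POSITIVE or NEGATIVE.
Review: "The battery lasts two days." -> POSITIVE
Review: "The screen cracked after a week." -> NEGATIVE
Review: "Setup took five minutes and everything just worked." ->

The model then completes the last line in the same style as the examples.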

To control the grounding data used by an LLM and constrain it to your own data in enterprise GenAI solutions, consider using Retrieval Augmented Generation (RAG). You can see how to use it in Azure, for example, here - Retrieval Augmented Generation (RAG) in Azure AI Search
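
In practice, the RAG pattern comes down to retrieving relevant passages from your own data (for example from an Azure AI Search index) and placing them in front of the question. A simplified prompt template, with the placeholders being assumptions for illustration:

You are an assistant for our company. Answer the question using only the context below.
If the answer is not in the context, say that you do not know.

Context:
{passages retrieved from the search index}

Question: {user question}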

Additionally, to gain more from LLMs, you can explore architecture patterns and mental models as described here - Generative AI Design Patterns: A Comprehensive Guide

Saturday, December 2, 2023

A comprehensive overview of generative AI and LLMs' trends, use cases, and future implications II. - Engineering and development insights

7 weeks (from 4.9. to 22.10.2023) in the world of Large Language Models and Generative AI tools, this time more focused on the engineering side:


Prompt engineering:

Parallel processing in prompt engineering: the skeleton-of-thought technique.

Unlocking reliable generations through Chain-of-Verification - a leap in prompt engineering.

LLMOps: production prompt engineering patterns with Hamilton.

Crafting different types of program simulation prompts - defining the new program simulation prompt framework.

Some kick-ass prompt engineering techniques to boost our LLMs.

And other prompt engineering tips, a neural network how-to, and recent must-reads.


AI Development and Engineering:

The team behind GitHub Copilot shares its lessons from building the app.

Amazon Bedrock for building and scaling generative AI applications is now generally available.

Experience from building generative AI apps on Amazon Web Services, using Amazon Bedrock and SageMaker.

A guide with 7 steps for mastering LLMs.

Key tools for enhancing Generative AI in Data Lake Houses.

An introduction to loading Large Language models.

Introduction to ML engineering and LLMOps with OpenAI and LangChain.

MLOps and LLM deployment strategies for software engineers.

Modern MLOps platform for Generative AI.

Leveraging the power of LLMs to guide AutoML hyperparameter searches.

LLMs demand Observability-Driven Development.

LLM monitoring and observability — a summary of techniques and approaches.

How to build and benchmark your LLM evals.

A step-by-step guide to selecting and running your own generative model.

Google Research: Outperforming larger language models with less training data and smaller model sizes - distilling step-by-step.

Google Research: Rethinking calibration for in-context learning and prompt engineering.

Apache Kafka as a mission-critical Data Fabric for GenAI.

Training ChatGPT on your own data.

Hugging Face's guide to optimizing LLMs in production.

Hugging Face is becoming the "GitHub" for Large Language Models.

Building a microservice for multi-chat backends using Llama and ChatGPT.

Connect GPT models with company data in Microsoft Azure.

Tuning LLMs with MakerSuite.

Fine-tuning LLMs: Parameter Efficient Fine Tuning (PEFT), LoRA and QLoRA.
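
For context on why these methods are "parameter efficient": LoRA freezes the pretrained weight matrix W_0 (of size d x k) and learns only a low-rank update, so the adapted layer computes

h = W_0 x + (\alpha / r) B A x, with B \in R^{d \times r}, A \in R^{r \times k} and rank r \ll \min(d, k),

meaning only the r(d + k) parameters of B and A are trained. QLoRA additionally keeps the frozen W_0 in 4-bit quantized form.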

How to train BERT for masked language modeling tasks.

Extending context length in Large Language Models.

Conversational applications with Large Language Models: understanding the sequence of user inputs, prompts, and responses.

Using data lakes and Large Language Models in development.

How to build an LLM from scratch.

LLM output parsing: function calling vs. LangChain.

Enhancing the power of Llama 2: 3 easy methods for improving your Large Language Model.


Keeping LLMs relevant and current - Retrieval Augmented Generation (RAG).

Build and deploy Retrieval Augmented Generative Pipelines with Haystack.

Why your RAG is not reliable in a production environment.


QCon San Francisco: 

Unlocking enterprise value with Large Language Models.

A modern compute stack for scaling large AI, ML, & LLM workloads.

Saturday, November 25, 2023

A comprehensive overview of generative AI and LLMs' trends, use cases, and future implications I. - Business, technology trends and applications

7 weeks (from 4.9. to 22.10.2023) in the world of Large Language Models and Generative AI tools:


AI in Business and Technology Trends:

How OpenAI turned LLMs into a mainstream success.

Oracle outlines a vision for AI and a cloud-driven future.

Enterprise SaaS companies have announced generative AI features, threatening AI startups.

How Generative AI is disrupting data practices.

Data Provenance in the age of Generative AI.

Is ChatGPT going to take data science jobs?

40% of the labour force will be affected by AI in 3 years.

And Gartner says: 

55% of organizations are in piloting or production mode with Generative AI.

CIOs must prioritize their AI ambition and AI-ready scenarios for the next 12-24 months.

More than 80% of enterprises will have used Generative AI APIs or deployed Generative AI-enabled applications by 2026.

60% of seller work will be executed by Generative AI technologies within five years.


AI Applications and Use Cases:

Large Language Models in real-world customer experience applications.

Five generative AI use cases companies can implement today.

Five use cases for CFOs using generative AI.

Revolutionizing business automation with generative AI.

Redefining conversational AI with Large Language Models.

Pros and cons of LLMs for moderating bad content.

Generative AI on research papers using the Nougat model.

Document topic extraction with Large Language Models and the Latent Dirichlet Allocation (LDA) algorithm.

Using AI to add vector search to Cassandra in six weeks.