Bayaml¶

Bayaml is a lightweight ML & AI orchestration framework designed for structured, reproducible, and deployable machine learning workflows.

It features a deterministic execution engine and an intelligent Auto Mode (.auto()) that interprets structured instructions using rule-based NLP, without relying on external AI services. Your data remains completely local and secure.

Auto Mode automatically manages:

Data loading
Safe preprocessing
Task detection (classification or regression)
Structured pipeline construction
Model training
Evaluation
Deployment

Bayaml includes built-in AutoML capabilities such as model comparison, cross-validation, leaderboard tracking, and best-model selection — all executed within reproducible, hash-based execution plans.

It provides structured APIs across simple, fluent, and advanced orchestration layers, along with natural-language-driven ML pipelines and export options including REST deployment and ONNX.

Designed for engineers, researchers, startups, and ML teams.

Why Bayaml?¶

Modern ML pipelines are often:

Hard to reproduce
Poorly structured
Over-engineered
Difficult to deploy

Bayaml provides a clean orchestration layer on top of scikit-learn to make ML:

Reproducible
Structured
Deterministic
Deployable

Without sacrificing flexibility.

Natural Language ML (.auto())¶

Train models using structured instructions:

from bayaml import Project

p = Project()

result = p.auto(
    "use https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv "
    "treat tip as target "
    "train regression model using linear regression"
)

print(result)

What Happens Automatically¶

Dataset loading (CSV, URL, DataFrame)
Target detection
Task detection (classification/regression)
Automatic categorical encoding
Safe pipeline ordering
Train-test split
Model training
Evaluation
Deterministic plan hashing

This makes Bayaml a lightweight AutoML orchestration engine.

Execution Plan Preview¶

Preview generated pipelines before execution:

plan = p.auto(
    "use data.csv treat price as target train regression model",
    preview=True
)

for step in plan.steps:
    print(step.name)

Bayaml builds a deterministic execution plan with hashing for reproducibility.

Core Capabilities¶

Structured ML Orchestration Layered architecture wrapping a deterministic execution engine.
Auto Encoding Categorical columns automatically encoded before training.
Auto Task Detection Classification vs regression inferred automatically.
AutoML Engine Model comparison, leaderboard tracking, cross-validation.
Model Deployment Export models for production use.

Deployment¶

REST Deployment¶

p.auto(
    "use data.csv treat target as target "
    "train classification model "
    "deploy as rest"
)

Generates a deployable REST bundle.

ONNX Export (Edge / C++ Ready)¶

p.auto(
    "use data.csv treat target as target "
    "train regression model "
    "deploy in c++"
)

Exports model as ONNX.

Installation¶

pip install bayaml

Dependencies:

pandas
numpy
scikit-learn
matplotlib
pyyaml

API Layers¶

Bayaml supports three levels of abstraction.

1. Simple API¶

from bayaml import quick_train

metrics = quick_train(
    data="data.csv",
    target="Target",
    model="linear_regression"
)

2. Fluent API¶

from bayaml import Bayaml

metrics = (
    Baya("data.csv", target="Target")
    .train("logistic_regression")
    .evaluate()
)

3. Advanced Orchestration API¶

from bayaml import Project

project = Project()
project.data.load("data.csv")
project.data.set_target("Target")
project.split.train_test()
project.model.create("linear_regression")
project.model.train()
project.evaluate.evaluate_regressor()

Full modular control.

AutoML¶

from bayaml import automl

result = automl(
    data="data.csv",
    target="Target"
)

print(result["best_model"])
print(result["best_score"])

Includes:

Cross-validation
Model comparison
Leaderboard generation
Best model selection
Run tracking

Export System¶

Export metrics, predictions, and reports:

CSV
JSON
Excel (XLSX)
PDF
DOCX
PNG / JPG

Output Modes¶

Bayaml supports multiple output rendering modes, allowing results to be displayed in different formats depending on workflow, integration needs, or presentation style.

These modes control how model results, metrics, and execution details are formatted — without affecting the underlying execution.

Available Output Modes¶

pretty – Human-readable formatted output
original – Raw native Bayaml execution result
sklearn – scikit-learn compatible output format
pandas – Pandas DataFrame structured output
numpy – NumPy array structured output
json – Machine-readable JSON output
table – Clean tabular display format
markdown – Markdown-rendered output
latex – LaTeX formatted output
diagnostic – Detailed execution diagnostics

Example Usage¶

result = p.auto(
    "use iris.csv treat species as target train classification model",
    mode="pretty"
)

result = p.auto(
    "use iris.csv treat species as target train classification model",
    mode="json"
)

Each mode adapts the output structure while keeping the execution deterministic and reproducible.

Why This Matters¶

These output modes allow Bayaml to integrate seamlessly with:

Research workflows (LaTeX, Markdown)
Production systems (JSON, NumPy)
Data pipelines (Pandas, sklearn)
Debugging and diagnostics
Human-readable reporting

This flexibility bridges experimentation, reporting, and deployment.

Architecture¶

Core Engine ↓ Execution Plan Builder ↓ Deterministic Plan Hashing ↓ Project API ↓ AutoML / CLI / Simple API

All APIs wrap the same deterministic engine.

No duplicated logic.

Reproducibility¶

Every .auto() run generates:

Dataset hash
Plan hash
Ordered execution steps

Ensuring reproducible ML workflows.

Developer Setup¶

git clone https://github.com/adityassarode/bayaml
cd bayaml
pip install -e .[dev]
pytest

Positioning¶

Bayaml is ideal for:

ML engineers building reproducible pipelines
Startups needing fast ML deployment
Data scientists wanting structured automation
Teams needing deterministic ML workflows

License¶

MIT License

Author¶

Bayaml is built and maintained by Aditya Sarode, focused on scalable AI systems, ML architecture, and production-ready engineering.

Documentation: