Commit be6afc29 by Leo

upload code

parent 443d5bdf
# Mathematical Foundations of AI and Their Code Mappings
Artificial intelligence (AI) draws on knowledge from several areas of mathematics. This article collects the mathematical concepts most commonly encountered in AI, together with simple implementations in Python (mainly NumPy and PyTorch), to make them easier to understand and put into practice.
---
## Advanced Mathematics Fundamentals
### Calculus
Calculus is the mathematical tool for studying rates of change and accumulated quantities. In deep learning, training a model is essentially a continuous optimization problem, involving derivatives, partial derivatives, and the search for extrema.
**Code example: the derivative as the slope of a function**
```python
import numpy as np

def f(x):
    return x**2 + 3*x + 2

# Numerical derivative via the central difference formula
def numerical_derivative(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

x = 1.0
print("Derivative:", numerical_derivative(f, x))
```
### The Chain Rule (the Core of Backpropagation)
Gradient computation in neural networks relies on the chain rule.
**Code example: a simple chain-rule demonstration**
```python
# y = f(g(x)), where f(u) = u^2 and g(x) = 3x + 1
x = 2.0
g = 3 * x + 1
f = g**2

# Chain rule: dy/dx = df/dg * dg/dx
df_dg = 2 * g
dg_dx = 3
dy_dx = df_dg * dg_dx
print("dy/dx =", dy_dx)
```
### Gradient Descent
The gradient points in the direction of steepest ascent of a function at a given point. Gradient descent moves the parameters in the opposite direction, updating them repeatedly to minimize the loss function and thereby obtain a better model.
**Code example: gradient descent on a simple function**
```python
w = 5.0   # initial parameter
lr = 0.1  # learning rate

for i in range(100):
    grad = 2 * (w - 3)  # gradient of the objective (w - 3)^2
    w -= lr * grad

print("Optimized w:", w)
```
---
## Linear Algebra
Linear algebra plays a role in AI mainly in the following ways:
* **Data representation**: images, text, audio, and other high-dimensional data are usually represented as vectors or matrices so that models can process them.
* **Model construction**: each layer of a neural network can be viewed as a linear transformation that applies a weight matrix to an input vector.
* **Forward propagation**: features are extracted by multiplying input vectors with weight matrices and applying activation functions.
* **Backpropagation and gradient computation**: relies heavily on matrix derivatives and matrix products to compute parameter updates (a short gradient sketch follows the forward-propagation example below).
* **Eigendecomposition and dimensionality reduction**: e.g. PCA (principal component analysis) and SVD (singular value decomposition), used for dimensionality reduction and data compression.
### Vector and Matrix Operations
**Code example: matrix multiplication and forward propagation**
```python
import numpy as np

x = np.array([[1, 2]])      # input
W = np.array([[0.1, 0.2],   # weight matrix
              [0.3, 0.4]])
b = np.array([[0.5, 0.6]])  # bias

output = np.dot(x, W) + b
print("Output:", output)
```
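To make the backpropagation bullet above concrete: for a linear layer `output = x·W + b`, the gradient of a scalar loss with respect to `W` is `xᵀ` multiplied by the upstream gradient `∂L/∂output`. A minimal sketch, assuming a made-up upstream gradient `dL_dy` purely for illustration:
```python
import numpy as np

x = np.array([[1, 2]])           # input, shape (1, 2)
dL_dy = np.array([[0.5, -1.0]])  # hypothetical upstream gradient dL/d(output)

# Chain rule for output = x @ W + b:
dL_dW = x.T @ dL_dy  # gradient w.r.t. W, shape (2, 2), same shape as W
dL_db = dL_dy        # gradient w.r.t. b, same shape as b

print("dL/dW =\n", dL_dW)
print("dL/db =", dL_db)
```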
---
### Eigendecomposition and Dimensionality Reduction (PCA)
**Code example: PCA feature extraction (simplified)**
```python
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

X = load_iris().data
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print("Reduced data:", X_reduced[:5])
```
---
## Probability and Statistics
Probability theory and statistics give AI its basic tools for quantifying uncertainty, inferring patterns, and building models:
* **Probabilistic models**: describe random variables and the relationships between them, e.g. joint and conditional probabilities.
* **Statistical inference**: includes maximum likelihood estimation (MLE) and hypothesis testing.
* **Bayes' theorem**: combines prior knowledge with observed data to infer posterior probabilities.
* **Maximum likelihood estimation (MLE)**: finds the model parameters under which the observed data is most probable (a short sketch follows this list).
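A minimal MLE sketch: for a Gaussian, the maximum-likelihood estimates have closed forms, namely the sample mean and the (biased) sample variance. The synthetic data below is generated purely for illustration:
```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)  # synthetic observations

mu_mle = data.mean()                     # MLE of the mean: the sample mean
var_mle = ((data - mu_mle) ** 2).mean()  # MLE of the variance: divide by n, not n-1

print("MLE mean:", mu_mle)        # close to the true 2.0
print("MLE variance:", var_mle)   # close to the true 1.5**2 = 2.25
```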
---
### Bayes' Theorem
**Code example: the naive Bayes idea in practice**
```python
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = GaussianNB()
model.fit(X, y)
print("Prediction:", model.predict([X[0]]))
```
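The theorem behind the classifier is P(A|B) = P(B|A)·P(A) / P(B). A direct numeric check, with probabilities made up purely for illustration (a test with 99% sensitivity and a 5% false-positive rate, applied to a condition with 1% prevalence):
```python
p_A = 0.01             # prior P(A): prevalence of the condition
p_B_given_A = 0.99     # likelihood P(B|A): test sensitivity
p_B_given_notA = 0.05  # false-positive rate P(B|not A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

p_A_given_B = p_B_given_A * p_A / p_B  # Bayes' theorem
print("Posterior P(A|B):", p_A_given_B)  # ≈ 0.167: most positives are false
```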
---
## Information Theory
Information theory gives AI tools for quantifying information and building efficient models:
* **Entropy**: measures uncertainty; used to assess dataset purity and the confidence of model predictions.
* **Mutual information (MI)**: measures the dependence between variables.
* **Cross-entropy**: commonly used as the loss function in classification tasks (a short example follows this list).
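This section has no example of its own, so here is a minimal NumPy sketch of entropy and cross-entropy over discrete distributions (the distributions are made up for illustration):
```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum(p * log p), in nats."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + 1e-12))  # epsilon guards against log(0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum(p * log q)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q + 1e-12))

p = np.array([1.0, 0.0, 0.0])  # one-hot "true" distribution
q = np.array([0.7, 0.2, 0.1])  # model's predicted distribution

print("Entropy of q:", entropy(q))
print("Cross-entropy H(p, q):", cross_entropy(p, q))  # equals -log(0.7) here
```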
---
## Optimization Algorithms
Optimization algorithms are at the core of training AI models; the goal is to minimize the loss function and improve model performance:
* **Gradient descent**
* **Momentum, Adagrad, RMSProp, Adam**, and other adaptive methods (an Adam variant of the example below is sketched after it)
### Basic Gradient Descent (for Fitting a Model)
```python
import torch

# Objective: loss = (w - 2)^2
w = torch.tensor(5.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

for i in range(100):
    loss = (w - 2)**2
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

print("Trained w:", w.item())
```
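Swapping in one of the adaptive methods from the list above only changes the optimizer line. A sketch using Adam on the same toy objective (the learning rate and step count here are chosen arbitrarily):
```python
import torch

w = torch.tensor(5.0, requires_grad=True)
optimizer = torch.optim.Adam([w], lr=0.1)  # adaptive per-parameter step sizes

for i in range(200):  # Adam takes roughly lr-sized steps early on, so allow more iterations
    loss = (w - 2)**2
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

print("Trained w:", w.item())  # should end up close to 2
```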
### Supplementary Material: gnn.club
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "f72cecef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"导数值: 4.999999999810711\n"
]
}
],
"source": [
"import numpy as np # 导入numpy库,主要用于科学计算\n",
"\n",
"# 定义一个函数f(x),计算表达式 x^2 + 3x + 2 的值\n",
"def f(x):\n",
" return x**2 + 3*x + 2\n",
"\n",
"# 定义一个数值导数函数 numerical_derivative\n",
"# 输入:\n",
"# f :目标函数\n",
"# x :求导点\n",
"# eps :一个非常小的数,用来计算导数的差分间隔,默认是1e-6\n",
"def numerical_derivative(f, x, eps=1e-6):\n",
" # 利用中心差分公式近似导数:\n",
" # f'(x) ≈ [f(x + eps) - f(x - eps)] / (2 * eps)\n",
" return (f(x + eps) - f(x - eps)) / (2 * eps)\n",
"\n",
"x = 1.0 # 设定求导点x=1.0\n",
"\n",
"# 打印在x=1.0处的导数近似值 \n",
"print(\"导数值:\", numerical_derivative(f, x))\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8449ad63",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dy/dx = 42.0\n"
]
}
],
"source": [
"# 复合函数的例子:y = f(g(x))\n",
"# 其中 f(x) = x^2,g(x) = 3x + 1\n",
"x = 2.0 # 给定自变量x的值\n",
"\n",
"g = 3 * x + 1 # 计算内部函数g(x)的值,g(2) = 3*2 + 1 = 7\n",
"f = g**2 # 计算外部函数f(g)的值,f(7) = 7^2 = 49\n",
"\n",
"# 链式法则求导:\n",
"# dy/dx = df/dg * dg/dx\n",
"# df/dg 是 f 对 g 的导数,f(g) = g^2,因此 df/dg = 2g\n",
"df_dg = 2 * g # 计算 df/dg,此处为 2 * 7 = 14\n",
"\n",
"# dg/dx 是 g 对 x 的导数,g(x) = 3x + 1,导数为3\n",
"dg_dx = 3 # 计算 dg/dx,值为3\n",
"\n",
"# 计算 dy/dx = df/dg * dg/dx = 14 * 3 = 42\n",
"dy_dx = df_dg * dg_dx\n",
"\n",
"print(\"dy/dx =\", dy_dx) # 输出导数值\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "02622489",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"优化后的 w: 3.0000000004074074\n"
]
}
],
"source": [
"# 初始化参数 w,目标是通过优化让 w 接近 3\n",
"w = 5.0 # 初始参数值为 5,距离目标值 3 有一定的偏差\n",
"\n",
"# 设置学习率(learning rate),控制每次参数更新的步长\n",
"lr = 0.1 # 学习率是 0.1,不能太大也不能太小\n",
"\n",
"# 梯度下降的迭代过程\n",
"for i in range(100): # 执行 100 次迭代更新\n",
" grad = 2 * (w - 3) # 计算损失函数 (w - 3)^2 的梯度,公式推导如下:\n",
" # 假设损失函数 f(w) = (w - 3)^2\n",
" # 则 f'(w) = 2*(w - 3)\n",
" \n",
" w -= lr * grad # 使用梯度下降法更新参数 w\n",
" # w = w - 学习率 × 梯度\n",
" # 目的是逐步减小损失函数的值,向最小值(最优 w=3)靠近\n",
"\n",
"# 输出最终优化后的参数 w,应该非常接近 3\n",
"print(\"优化后的 w:\", w)\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7988f5f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"输出结果: [[1.2 1.6]]\n"
]
}
],
"source": [
"import numpy as np # 导入NumPy库,用于处理数组和矩阵运算\n",
"\n",
"x = np.array([[1, 2]]) # 定义输入数据x,是一个形状为(1, 2)的二维数组,表示有两个特征\n",
"\n",
"W = np.array([[0.1, 0.2], # 定义权重矩阵W,形状为(2, 2),表示有2个输入和2个输出神经元\n",
" [0.3, 0.4]])\n",
"\n",
"b = np.array([[0.5, 0.6]]) # 定义偏置项b,形状为(1, 2),每个输出神经元对应一个偏置值\n",
"\n",
"output = np.dot(x, W) + b # 先进行矩阵乘法x·W(形状变为1x2),再加上偏置b(逐元素加法)\n",
" # 即 output = xW + b,表示一层神经网络的线性变换\n",
"\n",
"print(\"输出结果:\", output) # 打印输出结果,形如 [[1.2 1.6]]\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "f345a811",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1 2]]\n",
"[[0.1 0.2]\n",
" [0.3 0.4]]\n"
]
}
],
"source": [
"print(x)\n",
"print(W)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6972756f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"降维后的数据: [[-2.68412563 0.31939725]\n",
" [-2.71414169 -0.17700123]\n",
" [-2.88899057 -0.14494943]\n",
" [-2.74534286 -0.31829898]\n",
" [-2.72871654 0.32675451]]\n"
]
}
],
"source": [
"from sklearn.decomposition import PCA # 从sklearn库中导入PCA类,用于主成分分析降维\n",
"from sklearn.datasets import load_iris # 导入load_iris函数,用来加载鸢尾花数据集\n",
"\n",
"X = load_iris().data # 加载鸢尾花数据集的特征数据,X是一个形状为(150, 4)的数组,表示150个样本,每个样本4个特征\n",
"\n",
"pca = PCA(n_components=2) # 创建一个PCA对象,指定降维到2个主成分(把4维数据降到2维)\n",
"\n",
"X_reduced = pca.fit_transform(X) \n",
"# 先用X训练PCA模型(fit),计算主成分方向\n",
"# 然后将原始数据X映射到这两个主成分构成的新空间中(transform)\n",
"# fit_transform是fit和transform的合并操作,输出降维后的数据,形状为(150, 2)\n",
"\n",
"print(\"降维后的数据:\", X_reduced[:5]) \n",
"# 打印降维后数据的前5条样本,每条样本现在只有两个特征(两个主成分的值)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f8108c5",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from sklearn.datasets import load_iris\n",
"\n",
"iris = load_iris() \n",
"df = pd.DataFrame(iris.data, columns=iris.feature_names)\n",
"df['label'] = iris.target\n",
"print(df.head()) # 打印前5行\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "pytorch_gpu",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
-- "a/3-Python\347\274\226\347\250\213\345\237\272\347\241\200/3.5-\346\225\260\346\215\256\345\210\206\346\236\220\345\267\245\345\205\267\345\214\205/.gitkeep"
++ /dev/null
[
{
"id": "001",
"name": "点头教育",
"url": "www.diantouedu.cn",
"age": 10
},
{
"id": "002",
"name": "Google",
"url": "www.google.com",
"age": 100
},
{
"id": "003",
"name": "淘宝",
"url": "www.taobao.com",
"age": 50
}
]