Reading MongoDB Data with Multiple Threads in Python - An Example for Improving Efficiency

The following example shows one way to read MongoDB data from multiple Python threads:

import pymongo
import threading

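# Thread function: read the documents in the half-open range [start, end)
def read_data(collection, start, end):
    if end <= start:
        return  # limit(0) means "no limit" in MongoDB, so skip empty ranges
    # Sort by _id so the skip/limit slices are stable and do not overlap
    cursor = collection.find().sort('_id', pymongo.ASCENDING)
    for data in cursor.skip(start).limit(end - start):
        print(data)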

# Create the MongoDB connection
client = pymongo.MongoClient('mongodb://localhost:27017/')

# Get the database and collection
db = client['mydatabase']
collection = db['mycollection']

# Compute how many documents each thread reads
total_data = collection.count_documents({})
num_threads = 4
batch_size = total_data // num_threads

# Create the list of threads
threads = []

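# Start the threads
for i in range(num_threads):
    start = i * batch_size
    # The last thread also takes the remainder when total_data
    # is not evenly divisible by num_threads
    end = total_data if i == num_threads - 1 else start + batch_size
    t = threading.Thread(target=read_data, args=(collection, start, end))
    t.start()
    threads.append(t)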

# Wait for all threads to finish
for t in threads:
    t.join()

# Close the MongoDB connection
client.close()

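In this example, we first create a MongoDB connection and get the database and collection to read from. We then compute how many documents each thread should read and create a list to hold the threads. Next, a loop starts each thread and appends it to the list. Finally, we wait for all threads to finish and close the MongoDB connection. In the thread function read_data, we read documents from the collection with find() and print each one, using skip() and limit() to restrict each thread to its own slice of the data. Because skip() still has to walk past the skipped documents on the server, the later slices cost more than the earlier ones on large collections.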
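The slicing step described above can also be sketched as a small helper that hands out every document exactly once, including the remainder when the total is not divisible by the number of threads. This is only a minimal sketch under stated assumptions: split_ranges, read_in_parallel and the fetch callback are hypothetical names that do not appear in the original example, and ThreadPoolExecutor from the standard library is swapped in for the manual threading.Thread list so that results are returned instead of printed.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total, num_workers):
    """Split range(total) into contiguous half-open (start, end) ranges.

    Returns at most num_workers ranges; the first `total % num_workers`
    ranges are one element longer, so no document is left out.
    """
    if total <= 0 or num_workers <= 0:
        return []
    base, extra = divmod(total, num_workers)
    ranges = []
    start = 0
    for i in range(num_workers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer documents than workers: stop handing out ranges
        ranges.append((start, start + size))
        start += size
    return ranges


def read_in_parallel(fetch, total, num_workers=4):
    """Call fetch(start, end) for each range on a thread pool.

    `fetch` is a hypothetical callback that returns the documents in
    [start, end) as a list. Results are concatenated in range order.
    """
    ranges = split_ranges(total, num_workers)
    if not ranges:
        return []
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map yields results in the same order as `ranges`
        chunks = pool.map(lambda r: fetch(*r), ranges)
        return [doc for chunk in chunks for doc in chunk]


# Assumed usage with pymongo (not executed here):
#   fetch = lambda s, e: list(
#       collection.find().sort('_id', 1).skip(s).limit(e - s))
#   docs = read_in_parallel(fetch, collection.count_documents({}))
```

Passing the read logic in as a callback keeps the range arithmetic independent of MongoDB, so it can be checked against a plain list before being pointed at a real collection.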