lijinshan123

42 stars written in Python

Command-line program to download videos from YouTube.com and other video sites

Python 134,065 10,196 Updated Feb 7, 2025

A feature-rich command-line audio/video downloader

Python 99,623 7,799 Updated Feb 9, 2025

⏬ Dumb downloader that scrapes the web

Python 54,628 9,702 Updated Jan 4, 2025

Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 54,097 10,645 Updated Feb 6, 2025

TensorFlow code and pre-trained models for BERT

Python 38,619 9,656 Updated Jul 23, 2024

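Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing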

Python 34,382 10,340 Updated Jan 15, 2025

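Xiaohongshu note and comment crawler, Douyin video and comment crawler, Kuaishou video and comment crawler, Bilibili video and comment crawler, Weibo post and comment crawler, Baidu Tieba post and comment-reply crawler, Zhihu Q&A/article and comment crawler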

Python 19,674 5,889 Updated Feb 8, 2025

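Data collection tool for TikTok posts/likes/mixes/livestreams/videos/image sets/music and for Douyin posts/likes/favorites/favorite folders/videos/image sets/live photos/livestreams/music/mixes/comments/accounts/search/trending lists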

Python 8,926 1,431 Updated Feb 6, 2025

all kinds of text classification models and more with deep learning

Python 7,888 2,570 Updated Sep 28, 2023

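More and more websites have anti-crawler features: some hide key data in images, and some use user-hostile CAPTCHAs. This repository collects anti-anti-crawler code, building skill by contending (without malice) with websites that have different characteristics. (Submissions of hard-to-scrape websites are welcome.) (Project paused due to work commitments.)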

Python 7,284 2,174 Updated Oct 17, 2021

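Xiaohongshu (XiaoHongShu, RedNote) link extraction and post collection tool: extracts links to an account's published, favorited, liked, and album posts; extracts post and user links from search results; collects Xiaohongshu post information; extracts Xiaohongshu post download URLs; downloads watermark-free Xiaohongshu post files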

Python 6,460 915 Updated Feb 7, 2025

Library for building WebSocket servers and clients in Python

Python 5,300 531 Updated Feb 9, 2025

CNN-RNN Chinese text classification, based on TensorFlow

Python 4,183 1,469 Updated Mar 31, 2024

Lightweight, scriptable browser as a service with an HTTP API

Python 4,119 512 Updated Aug 2, 2024

A frida tool to dump dex in memory to support security engineers analyzing malware.

Python 4,102 911 Updated Mar 4, 2023

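Agricultural knowledge graph (AgriKG): information retrieval, named entity recognition, relation extraction, intelligent question answering, and decision support for the agricultural domain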

Python 4,085 1,572 Updated Jul 19, 2024

Open Chinese chat corpus

Python 4,056 788 Updated Apr 23, 2024

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

Python 3,388 641 Updated Oct 29, 2024

A service daemon to run Scrapy spiders

Python 2,998 573 Updated Jan 31, 2025

Use a CNN to recognize CAPTCHAs with TensorFlow. This project targets character-based image CAPTCHAs and implements a convolutional neural network in TensorFlow to recognize them.

Python 2,805 786 Updated Dec 8, 2022

CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

Python 2,761 1,080 Updated Oct 8, 2019

Datasets for training Chinese and English chatbot (dialogue) systems

Python 2,039 496 Updated Sep 23, 2020

Builds a small securities knowledge graph/knowledge base from data publicly available on the web

Python 2,031 596 Updated Jul 23, 2020

The official Python SDK for Sentry.io

Python 1,963 521 Updated Feb 10, 2025

Web crawling: introductory, intermediate, and advanced

Python 1,911 258 Updated Nov 8, 2024

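A knowledge-graph-based question answering system that uses BERT for named entity recognition and sentence similarity, with online and outline modes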

Python 1,462 350 Updated Dec 16, 2021

Xiaohongshu crawler: scrapes Xiaohongshu notes, profile pages, and search results

Python 1,197 208 Updated Jan 26, 2025

dgk_lost_conv: Chinese conversation corpus

Python 1,089 442 Updated May 6, 2021

🚁 Insurance industry corpus, chatbot

Python 1,020 343 Updated Jul 12, 2024

[Unmaintained] A simple and clean video/music/image downloader 👾

Python 815 141 Updated Mar 29, 2021