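Designing the Search and Recommendation Infrastructure for an E-commerce/Content Platform
1. Overview
1.1 Why Search and Recommendation Matter
Search and recommendation are core functions of e-commerce and content platforms, and they directly affect user experience and revenue:
Search system:
- Fast product discovery: helps users quickly find the products they want
- Higher conversion: accurate search results raise the purchase conversion rate
- User experience: a good search experience improves user satisfaction
Recommendation system:
- Personalization: recommends content based on each user's interests
- Higher click-through rate: accurate recommendations raise content click-through
- Longer sessions: recommending related content increases the time users spend on the platform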
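1.2 Core Challenges
Technical challenges:
- Search accuracy: how to return results that match what the user is looking for
- Personalization: how to tailor recommendations to each user
- Freshness: how to keep search and recommendation results close to real time
- High concurrency: how to sustain heavy search and recommendation traffic
- Data volume: how to handle massive product and content catalogs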
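1.3 Structure of This Article
This article walks through the search and recommendation infrastructure from the following angles:
- Search system design: search engine, index construction, search optimization
- Recommendation system design: recommendation algorithms, feature engineering, model training
- Architecture design: overall architecture, module design
- Technology selection: search engine, recommendation framework
- Implementation: code for each component
- Case studies: e-commerce search, content recommendation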
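2. Search System Design
2.1 Choosing a Search Engine
2.1.1 Mainstream Search Engines
Elasticsearch:
- Strengths: feature-rich, mature ecosystem, easy to use
- Use cases: full-text search, log analytics, data retrieval
Solr:
- Strengths: mature, stable, feature-rich
- Use cases: enterprise search, content search
OpenSearch:
- Strengths: an open-source (Apache 2.0) fork of Elasticsearch
- Use cases: an alternative to Elasticsearch
Recommendation: Elasticsearch (feature-rich, mature ecosystem)
2.1.2 Elasticsearch Characteristics
Core features:
- Distributed: runs as a cluster
- Full-text search: strong full-text search capabilities
- Near real time: by default, documents become searchable about one second after indexing
- Scalable: scales horizontally by adding nodes
- RESTful API: easy to integrate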
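2.2 Index Design
2.2.1 Product Index
Product data structure:
    import java.util.Date;
    import java.util.List;
    import org.springframework.data.annotation.Id;
    import org.springframework.data.elasticsearch.annotations.Document;

    // Getters and setters omitted for brevity.
    // @Document and @Id let the Spring Data Elasticsearch calls in 2.2.2 resolve
    // the index name and document ID; "products" matches the index queried in 2.3.2.
    @Document(indexName = "products")
    public class Product {
        @Id
        private Long productId;
        private String title;
        private String description;
        private String category;
        private String brand;
        private Double price;
        private Integer sales;
        private Double rating;
        private List<String> tags;
        private Date createTime;
    }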
Elasticsearch Mapping:
    {
      "mappings": {
        "properties": {
          "productId": { "type": "long" },
          "title": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_smart",
            "fields": {
              "keyword": { "type": "keyword" }
            }
          },
          "description": {
            "type": "text",
            "analyzer": "ik_max_word"
          },
          "category": { "type": "keyword" },
          "brand": { "type": "keyword" },
          "price": { "type": "double" },
          "sales": { "type": "integer" },
          "rating": { "type": "double" },
          "tags": { "type": "keyword" },
          "createTime": { "type": "date" }
        }
      }
    }
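2.2.2 Index Construction
Index service:
    import java.util.List;
    import java.util.stream.Collectors;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
    import org.springframework.data.elasticsearch.core.IndexOperations;
    import org.springframework.data.elasticsearch.core.query.IndexQuery;
    import org.springframework.data.elasticsearch.core.query.IndexQueryBuilder;
    import org.springframework.stereotype.Service;

    // Assumes Spring Data Elasticsearch 4.x.
    @Service
    public class SearchIndexService {

        @Autowired
        private ElasticsearchRestTemplate elasticsearchTemplate;

        // Create the index if it is missing, then apply the mapping derived from
        // the annotations on Product. To use the JSON mapping from 2.2.1 verbatim,
        // create the index with that JSON instead.
        public void createProductIndex() {
            IndexOperations indexOps = elasticsearchTemplate.indexOps(Product.class);
            if (!indexOps.exists()) {
                indexOps.create();
                indexOps.putMapping(indexOps.createMapping());
            }
        }

        // Index a single product.
        public void indexProduct(Product product) {
            elasticsearchTemplate.save(product);
        }

        // Index many products in one bulk request.
        public void bulkIndexProducts(List<Product> products) {
            if (products == null || products.isEmpty()) {
                return; // an empty bulk request is rejected by Elasticsearch
            }
            List<IndexQuery> queries = products.stream()
                    .map(product -> new IndexQueryBuilder()
                            .withId(product.getProductId().toString())
                            .withObject(product)
                            .build())
                    .collect(Collectors.toList());
            elasticsearchTemplate.bulkIndex(queries, Product.class);
        }

        // save() overwrites the whole document with the same ID.
        public void updateProduct(Product product) {
            elasticsearchTemplate.save(product);
        }

        // Remove a product from the index by ID.
        public void deleteProduct(Long productId) {
            elasticsearchTemplate.delete(productId.toString(), Product.class);
        }
    }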
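2.3 Search Implementation
2.3.1 Search Service
Search service:
    import java.util.List;
    import java.util.stream.Collectors;
    import org.apache.commons.lang3.StringUtils;
    import org.elasticsearch.common.unit.Fuzziness;
    import org.elasticsearch.index.query.BoolQueryBuilder;
    import org.elasticsearch.index.query.MultiMatchQueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.RangeQueryBuilder;
    import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
    import org.elasticsearch.search.sort.SortBuilders;
    import org.elasticsearch.search.sort.SortOrder;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.domain.PageRequest;
    import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
    import org.springframework.data.elasticsearch.core.SearchHit;
    import org.springframework.data.elasticsearch.core.SearchHits;
    import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
    import org.springframework.stereotype.Service;

    // SearchRequest and SearchResult here are the application's own DTOs
    // (not defined in this article), not the Elasticsearch client classes.
    @Service
    public class SearchService {

        @Autowired
        private ElasticsearchRestTemplate elasticsearchTemplate;

        public SearchResult<Product> searchProducts(SearchRequest request) {
            BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

            // Keyword: scored full-text match, title weighted 3x over description.
            // Boosts are set with field(name, boost); the "title^3" shorthand is not
            // parsed by the Java builder.
            if (StringUtils.isNotBlank(request.getKeyword())) {
                boolQuery.must(QueryBuilders.multiMatchQuery(request.getKeyword())
                        .field("title", 3.0f)
                        .field("description", 1.0f)
                        .type(MultiMatchQueryBuilder.Type.BEST_FIELDS)
                        .fuzziness(Fuzziness.AUTO));
            }

            // Category, brand and price are exact filters: not scored, and cacheable.
            if (StringUtils.isNotBlank(request.getCategory())) {
                boolQuery.filter(QueryBuilders.termQuery("category", request.getCategory()));
            }
            if (StringUtils.isNotBlank(request.getBrand())) {
                boolQuery.filter(QueryBuilders.termQuery("brand", request.getBrand()));
            }
            if (request.getMinPrice() != null || request.getMaxPrice() != null) {
                RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("price");
                if (request.getMinPrice() != null) {
                    rangeQuery.gte(request.getMinPrice());
                }
                if (request.getMaxPrice() != null) {
                    rangeQuery.lte(request.getMaxPrice());
                }
                boolQuery.filter(rangeQuery);
            }

            // Relevance first; sales and rating break ties. Sorting on sales alone
            // would ignore the keyword score entirely.
            NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder()
                    .withQuery(boolQuery)
                    .withPageable(PageRequest.of(request.getPage() - 1, request.getPageSize()))
                    .withSort(SortBuilders.scoreSort())
                    .withSort(SortBuilders.fieldSort("sales").order(SortOrder.DESC))
                    .withSort(SortBuilders.fieldSort("rating").order(SortOrder.DESC));

            // Ask Elasticsearch to highlight matches in title and description.
            queryBuilder.withHighlightFields(
                    new HighlightBuilder.Field("title"),
                    new HighlightBuilder.Field("description"));

            SearchHits<Product> searchHits = elasticsearchTemplate.search(
                    queryBuilder.build(), Product.class);

            List<Product> products = searchHits.stream()
                    .map(SearchHit::getContent)
                    .collect(Collectors.toList());

            SearchResult<Product> result = new SearchResult<>();
            result.setProducts(products);
            result.setTotal(searchHits.getTotalHits());
            result.setPage(request.getPage());
            result.setPageSize(request.getPageSize());
            return result;
        }
    }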
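2.3.2 Search Optimization
Relevance ranking:
    // buildSearchResult(...) is a helper not shown in this article; it is assumed to
    // map SearchHits to SearchResult the same way searchProducts does in 2.3.1.
    public SearchResult<Product> searchProductsWithRelevance(SearchRequest request) {
        MultiMatchQueryBuilder textQuery = QueryBuilders.multiMatchQuery(request.getKeyword())
                .field("title", 3.0f)
                .field("description", 1.0f);

        // With more than one score function, each must be wrapped in a FilterFunctionBuilder.
        FunctionScoreQueryBuilder.FilterFunctionBuilder[] functions = {
            new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                    ScoreFunctionBuilders.fieldValueFactorFunction("sales")
                            .modifier(FieldValueFactorFunction.Modifier.LOG1P)
                            .factor(0.1f)),
            new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                    ScoreFunctionBuilders.fieldValueFactorFunction("rating")
                            .modifier(FieldValueFactorFunction.Modifier.LOG1P)
                            .factor(0.2f))
        };

        // boostMode SUM adds the combined function value to the text score.
        // The functions themselves are combined with the default score_mode (multiply).
        FunctionScoreQueryBuilder functionScoreQuery =
                QueryBuilders.functionScoreQuery(textQuery, functions)
                        .boostMode(CombineFunction.SUM);

        NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder()
                .withQuery(functionScoreQuery)
                .withPageable(PageRequest.of(request.getPage() - 1, request.getPageSize()));

        SearchHits<Product> searchHits = elasticsearchTemplate.search(
                queryBuilder.build(), Product.class);
        return buildSearchResult(searchHits, request);
    }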
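To make the effect of the relevance query above concrete, the following sketch reproduces its score arithmetic in plain Java. It rests on two Elasticsearch defaults that the article does not state: the log1p modifier computes log10(1 + factor × value), and score_mode defaults to multiply. The class and method names are illustrative only.

```java
// Sketch of the score arithmetic behind the function_score query above.
// Assumptions (Elasticsearch defaults, not stated in the article):
//   - the log1p modifier computes log10(1 + factor * fieldValue)
//   - score_mode is "multiply", so the two function values are multiplied
//   - boost_mode "sum" then adds that product to the text relevance score
final class FunctionScoreSketch {

    // field_value_factor with the LOG1P modifier
    static double log1pFactor(double fieldValue, double factor) {
        return Math.log10(1.0 + factor * fieldValue);
    }

    // Final document score for a given text score, sales count and rating
    static double finalScore(double textScore, long sales, double rating) {
        double salesBoost = log1pFactor(sales, 0.1);
        double ratingBoost = log1pFactor(rating, 0.2);
        return textScore + salesBoost * ratingBoost;
    }
}
```

Under these assumptions, a product with 90 sales and a 4.5 rating gains 1.0 × log10(1.9), roughly 0.28, on top of its text score, while a product with zero sales gains nothing whatever its rating. If an additive bonus is wanted instead, setting score_mode to sum would add the two function values rather than multiply them.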
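Search suggestions:
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.util.List;
    import java.util.stream.Collectors;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.search.suggest.SuggestBuilder;
    import org.elasticsearch.search.suggest.SuggestBuilders;
    import org.elasticsearch.search.suggest.completion.CompletionSuggestion;
    import org.elasticsearch.search.suggest.completion.CompletionSuggestionBuilder;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    // Requires a field named "suggest" of type "completion" in the products index;
    // the mapping in 2.2.1 does not define one, so it has to be added first.
    @Service
    public class SearchSuggestionService {

        // The low-level client is injected directly; SearchRequest here is the
        // Elasticsearch client class, not the application DTO used in 2.3.1.
        @Autowired
        private RestHighLevelClient client;

        public List<String> getSuggestions(String keyword) {
            CompletionSuggestionBuilder suggestionBuilder =
                    SuggestBuilders.completionSuggestion("suggest")
                            .prefix(keyword)
                            .size(10);

            SearchRequest searchRequest = new SearchRequest("products");
            searchRequest.source().suggest(
                    new SuggestBuilder().addSuggestion("suggestions", suggestionBuilder));

            SearchResponse response;
            try {
                response = client.search(searchRequest, RequestOptions.DEFAULT);
            } catch (IOException e) {
                throw new UncheckedIOException("suggestion request failed", e);
            }

            CompletionSuggestion suggestion = response.getSuggest().getSuggestion("suggestions");
            return suggestion.getEntries().stream()
                    .flatMap(entry -> entry.getOptions().stream())
                    .map(option -> option.getText().string())
                    .collect(Collectors.toList());
        }
    }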
2.4 Search Tuning
2.4.1 Tokenization
IK analyzer configuration:
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "ik_max_word": {
              "type": "ik_max_word"
            },
            "ik_smart": {
              "type": "ik_smart"
            },
            "custom_analyzer": {
              "type": "custom",
              "tokenizer": "ik_max_word",
              "filter": ["lowercase", "stop"]
            }
          }
        }
      }
    }
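2.4.2 Search Caching
Redis cache:
    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.TimeUnit;
    import com.alibaba.fastjson.JSON;
    import com.alibaba.fastjson.TypeReference;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.stereotype.Service;
    import org.springframework.util.DigestUtils;

    // The same SearchService as in 2.3.1, showing only the caching wrapper.
    // doSearch(request) stands for the Elasticsearch query built there.
    @Service
    public class SearchService {

        @Autowired
        private RedisTemplate<String, String> redisTemplate;

        public SearchResult<Product> searchProducts(SearchRequest request) {
            // The key covers the whole request (keyword, filters, page, page size).
            // Keying on keyword and page alone would return results cached for a
            // different category, brand or price range.
            String cacheKey = "search:" + DigestUtils.md5DigestAsHex(
                    JSON.toJSONString(request).getBytes(StandardCharsets.UTF_8));

            String cached = redisTemplate.opsForValue().get(cacheKey);
            if (cached != null) {
                // TypeReference keeps the Product type parameter during deserialization.
                return JSON.parseObject(cached, new TypeReference<SearchResult<Product>>() {});
            }

            SearchResult<Product> result = doSearch(request);

            // Short TTL: results may lag index updates by up to 5 minutes.
            redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(result),
                    5, TimeUnit.MINUTES);
            return result;
        }
    }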
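A cached search result is only safe to reuse when the cache key reflects every parameter that changes the result. The sketch below shows one way to build such a key with the standard library alone. The class name, the method signature and the choice of SHA-256 are assumptions made for illustration, and lowercasing the keyword assumes the analyzer is case-insensitive.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Sketch: derive one cache key from every parameter that changes the result.
// Names are illustrative, not from the article.
final class SearchCacheKeys {

    static String cacheKey(String keyword, String category, String brand,
                           Double minPrice, Double maxPrice, int page, int pageSize) {
        StringBuilder canonical = new StringBuilder();
        // Keyword is trimmed and lowercased (assumes case-insensitive analysis).
        append(canonical, nullToEmpty(keyword).trim().toLowerCase(Locale.ROOT));
        // Category and brand are exact-match keyword fields, so they stay as given.
        append(canonical, nullToEmpty(category));
        append(canonical, nullToEmpty(brand));
        append(canonical, String.valueOf(minPrice));
        append(canonical, String.valueOf(maxPrice));
        append(canonical, String.valueOf(page));
        append(canonical, String.valueOf(pageSize));
        return "search:" + sha256Hex(canonical.toString());
    }

    // Length-prefix each field so adjacent fields cannot run into each other.
    private static void append(StringBuilder sb, String field) {
        sb.append(field.length()).append(':').append(field);
    }

    private static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }

    private static String sha256Hex(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(input.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

Hashing keeps the Redis key at a fixed length however long the keyword is, at the cost of keys that are no longer human-readable.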
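3. Recommendation System Design
3.1 Recommendation Algorithms
3.1.1 Collaborative Filtering
User-based collaborative filtering (User-Based CF): find users whose behavior is similar to the target user's, then recommend products those users interacted with that the target user has not.
Item-based collaborative filtering (Item-Based CF): for each product the user has interacted with, find similar products and recommend the ones the user has not seen.
3.1.2 Content-Based Recommendation
Content-based recommendation: recommend products whose attributes (such as category, brand and tags) match those of products the user has already engaged with.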
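As a minimal illustration of content-based recommendation, the sketch below ranks candidate products by the Jaccard overlap between their tags and a set of tags describing the user's interests. Jaccard similarity on tags is only one possible choice; the article does not prescribe a similarity measure, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of content-based ranking by tag overlap (names are illustrative).
final class ContentBasedSketch {

    // Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        int union = a.size() + b.size() - common.size();
        return union == 0 ? 0.0 : (double) common.size() / union;
    }

    // Return up to topN candidate IDs with non-zero overlap, best match first.
    static List<Long> recommend(Set<String> profileTags,
                                Map<Long, Set<String>> candidateTags, int topN) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Set<String>> e : candidateTags.entrySet()) {
            if (jaccard(profileTags, e.getValue()) > 0.0) {
                ids.add(e.getKey());
            }
        }
        // Higher similarity first; ties broken by product ID for a stable order.
        ids.sort((x, y) -> {
            int byScore = Double.compare(jaccard(profileTags, candidateTags.get(y)),
                                         jaccard(profileTags, candidateTags.get(x)));
            return byScore != 0 ? byScore : Long.compare(x, y);
        });
        return new ArrayList<>(ids.subList(0, Math.min(topN, ids.size())));
    }
}
```

The profile tags could be collected from the tags of products the user viewed or purchased; how they are built is outside this sketch.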
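3.1.3 Deep Learning Recommendation
Deep learning recommendation: train a neural network on user and product features to predict how likely a user is to interact with a product (see section 7).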
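3.2 Feature Engineering
3.2.1 User Features
User features:
- User ID
- Age, gender
- Behavior history (views, purchases, favorites)
- Preference tags
3.2.2 Product Features
Product features:
- Category, brand
- Price
- Sales, rating
3.2.3 Interaction Features
Interaction features:
- User-product interactions (clicks, purchases, favorites)
- Time features
- Context features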
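One common way to turn time into an interaction feature is to weight each event by its age, so that recent behavior counts more than old behavior. The sketch below uses exponential decay with a half-life. The half-life parameter and all names are assumptions for illustration; the article does not specify how time features are computed.

```java
// Sketch of a recency-weighted interaction feature (names are illustrative).
final class TimeDecaySketch {

    private static final double MILLIS_PER_DAY = 86_400_000.0;

    // weight = 0.5 ^ (ageInDays / halfLifeDays); future timestamps count as age 0.
    static double weight(long eventMillis, long nowMillis, double halfLifeDays) {
        double ageDays = Math.max(0L, nowMillis - eventMillis) / MILLIS_PER_DAY;
        return Math.pow(0.5, ageDays / halfLifeDays);
    }

    // Sum of decayed weights: a recency-aware interaction count.
    static double decayedCount(long[] eventMillis, long nowMillis, double halfLifeDays) {
        double total = 0.0;
        for (long t : eventMillis) {
            total += weight(t, nowMillis, halfLifeDays);
        }
        return total;
    }
}
```

With a seven-day half-life, an interaction from today counts as 1.0, one from a week ago as 0.5, and one from two weeks ago as 0.25.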
3.3 Recommendation Implementation
3.3.1 Collaborative Filtering Implementation
Collaborative filtering service:
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.stream.Collectors;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    // Not defined in this article and assumed to exist elsewhere:
    //   - calculateRecommendationScore(userId, productIds): scores each candidate
    //   - findSimilarProducts(productId): returns IDs of similar products
    //   - UserBehavior.getScore(): a numeric weight for the interaction
    @Service
    public class CollaborativeFilteringService {

        @Autowired
        private UserBehaviorMapper userBehaviorMapper;

        // User-based CF: recommend what similar users interacted with.
        public List<Long> recommendByUser(Long userId, int topN) {
            List<UserBehavior> userBehaviors = userBehaviorMapper.selectByUserId(userId);
            List<Long> similarUsers = findSimilarUsers(userId, userBehaviors);

            // Products the user already interacted with, for O(1) exclusion.
            Set<Long> seen = userBehaviors.stream()
                    .map(UserBehavior::getProductId)
                    .collect(Collectors.toSet());

            Set<Long> recommendedProducts = new HashSet<>();
            for (Long similarUserId : similarUsers) {
                for (UserBehavior behavior : userBehaviorMapper.selectByUserId(similarUserId)) {
                    if (!seen.contains(behavior.getProductId())) {
                        recommendedProducts.add(behavior.getProductId());
                    }
                }
            }

            return calculateRecommendationScore(userId, recommendedProducts)
                    .stream()
                    .sorted((a, b) -> Double.compare(b.getScore(), a.getScore()))
                    .limit(topN)
                    .map(Recommendation::getProductId)
                    .collect(Collectors.toList());
        }

        // Item-based CF: recommend products similar to those the user interacted with.
        public List<Long> recommendByItem(Long userId, int topN) {
            Set<Long> likedProducts = userBehaviorMapper.selectByUserId(userId).stream()
                    .map(UserBehavior::getProductId)
                    .collect(Collectors.toSet());

            Set<Long> recommendedProducts = new HashSet<>();
            for (Long likedProductId : likedProducts) {
                for (Long similarProductId : findSimilarProducts(likedProductId)) {
                    if (!likedProducts.contains(similarProductId)) {
                        recommendedProducts.add(similarProductId);
                    }
                }
            }

            return calculateRecommendationScore(userId, recommendedProducts)
                    .stream()
                    .sorted((a, b) -> Double.compare(b.getScore(), a.getScore()))
                    .limit(topN)
                    .map(Recommendation::getProductId)
                    .collect(Collectors.toList());
        }

        // Top 10 users by cosine similarity. This issues one query per user on every
        // request, which only works for small data sets; at scale the similarities
        // would be precomputed offline.
        private List<Long> findSimilarUsers(Long userId, List<UserBehavior> userBehaviors) {
            Map<Long, Double> userSimilarities = new HashMap<>();
            for (Long otherUserId : userBehaviorMapper.selectAllUserIds()) {
                if (otherUserId.equals(userId)) {
                    continue;
                }
                List<UserBehavior> otherUserBehaviors =
                        userBehaviorMapper.selectByUserId(otherUserId);
                double similarity = calculateCosineSimilarity(userBehaviors, otherUserBehaviors);
                if (similarity > 0.1) {
                    userSimilarities.put(otherUserId, similarity);
                }
            }

            return userSimilarities.entrySet().stream()
                    .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))
                    .limit(10)
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());
        }

        // Cosine similarity between two users' product-score vectors.
        private double calculateCosineSimilarity(List<UserBehavior> behaviors1,
                                                 List<UserBehavior> behaviors2) {
            // Sum scores per product so repeated interactions accumulate
            // instead of overwriting each other.
            Map<Long, Double> vector1 = new HashMap<>();
            Map<Long, Double> vector2 = new HashMap<>();
            behaviors1.forEach(b -> vector1.merge(b.getProductId(), b.getScore(), Double::sum));
            behaviors2.forEach(b -> vector2.merge(b.getProductId(), b.getScore(), Double::sum));

            Set<Long> productIds = new HashSet<>(vector1.keySet());
            productIds.addAll(vector2.keySet());

            double dotProduct = 0.0;
            double norm1 = 0.0;
            double norm2 = 0.0;
            for (Long productId : productIds) {
                double v1 = vector1.getOrDefault(productId, 0.0);
                double v2 = vector2.getOrDefault(productId, 0.0);
                dotProduct += v1 * v2;
                norm1 += v1 * v1;
                norm2 += v2 * v2;
            }

            if (norm1 == 0.0 || norm2 == 0.0) {
                return 0.0;
            }
            return dotProduct / (Math.sqrt(norm1) * Math.sqrt(norm2));
        }
    }
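The item-based method above relies on a findSimilarProducts helper that the article does not show. One possible shape for it, sketched here with plain collections, treats each product as the set of users who interacted with it and compares products by cosine similarity on those sets. The input map, the topK parameter and all names are assumptions; a production system would precompute these similarities offline rather than per request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of item-item similarity from user interactions (names are illustrative).
final class ItemSimilaritySketch {

    // Invert "user -> products" into "product -> users".
    static Map<Long, Set<Long>> usersByProduct(Map<Long, Set<Long>> productsByUser) {
        Map<Long, Set<Long>> result = new HashMap<>();
        productsByUser.forEach((userId, products) -> {
            for (Long productId : products) {
                result.computeIfAbsent(productId, k -> new HashSet<>()).add(userId);
            }
        });
        return result;
    }

    // Cosine similarity of two binary vectors: |A ∩ B| / sqrt(|A| * |B|).
    static double cosine(Set<Long> usersA, Set<Long> usersB) {
        if (usersA.isEmpty() || usersB.isEmpty()) {
            return 0.0;
        }
        Set<Long> common = new HashSet<>(usersA);
        common.retainAll(usersB);
        return common.size() / Math.sqrt((double) usersA.size() * usersB.size());
    }

    // Up to topK products that share at least one user with productId, most similar first.
    static List<Long> findSimilarProducts(long productId,
                                          Map<Long, Set<Long>> productsByUser, int topK) {
        Map<Long, Set<Long>> index = usersByProduct(productsByUser);
        Set<Long> target = index.getOrDefault(productId, Collections.emptySet());

        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : index.entrySet()) {
            if (e.getKey().longValue() == productId) {
                continue;
            }
            double similarity = cosine(target, e.getValue());
            if (similarity > 0.0) {
                scores.put(e.getKey(), similarity);
            }
        }

        List<Long> ids = new ArrayList<>(scores.keySet());
        // Higher similarity first; ties broken by product ID for a stable order.
        ids.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Long.compare(a, b);
        });
        return new ArrayList<>(ids.subList(0, Math.min(topK, ids.size())));
    }
}
```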
3.3.2 Popular Recommendation
Popular recommendation service:
    import java.util.List;
    import java.util.stream.Collectors;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    // Non-personalized fallbacks, useful for new users with no behavior history.
    @Service
    public class PopularRecommendationService {

        @Autowired
        private ProductMapper productMapper;

        // Best-selling products.
        public List<Long> recommendPopular(int topN) {
            return productMapper.selectTopBySales(topN)
                    .stream()
                    .map(Product::getProductId)
                    .collect(Collectors.toList());
        }

        // Most recently created products.
        public List<Long> recommendLatest(int topN) {
            return productMapper.selectTopByCreateTime(topN)
                    .stream()
                    .map(Product::getProductId)
                    .collect(Collectors.toList());
        }

        // Highest-rated products.
        public List<Long> recommendHighRating(int topN) {
            return productMapper.selectTopByRating(topN)
                    .stream()
                    .map(Product::getProductId)
                    .collect(Collectors.toList());
        }
    }
3.3.3 Hybrid Recommendation
Hybrid recommendation service:
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    @Service
    public class HybridRecommendationService {

        @Autowired
        private CollaborativeFilteringService cfService;

        @Autowired
        private PopularRecommendationService popularService;

        // Referenced below but not declared in the original listing;
        // ContentRecommendationService is an assumed name.
        @Autowired
        private ContentRecommendationService contentService;

        public List<Long> hybridRecommend(Long userId, int topN) {
            List<Long> cfRecommendations = cfService.recommendByUser(userId, topN * 2);
            List<Long> popularRecommendations = popularService.recommendPopular(topN);
            List<Long> contentRecommendations = contentService.recommendByContent(userId, topN);

            // Positional score times source weight: CF 0.6, popular 0.3, content 0.1.
            // The CF list is twice as long, so its positional scores also run twice
            // as high, which adds to the weight CF already has.
            Map<Long, Double> recommendationScores = new HashMap<>();

            for (int i = 0; i < cfRecommendations.size(); i++) {
                double score = (topN * 2 - i) * 0.6;
                recommendationScores.merge(cfRecommendations.get(i), score, Double::sum);
            }
            for (int i = 0; i < popularRecommendations.size(); i++) {
                double score = (topN - i) * 0.3;
                recommendationScores.merge(popularRecommendations.get(i), score, Double::sum);
            }
            for (int i = 0; i < contentRecommendations.size(); i++) {
                double score = (topN - i) * 0.1;
                recommendationScores.merge(contentRecommendations.get(i), score, Double::sum);
            }

            return recommendationScores.entrySet().stream()
                    .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))
                    .limit(topN)
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());
        }
    }
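The blending step can be isolated from the Spring services and tried on plain lists. The sketch below is a variant of the scoring above, not the article's exact formula: it scales each positional score to the range (0, 1] by dividing by the list length, so that lists of different lengths contribute on the same scale before the source weight is applied. Class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of weighted blending of ranked lists with length-normalized
// positional scores (a variant of the article's scoring; names are illustrative).
final class HybridFusionSketch {

    // rankedLists.get(s) is the ranked output of source s, weights[s] its weight.
    static List<Long> fuse(List<List<Long>> rankedLists, double[] weights, int topN) {
        Map<Long, Double> scores = new HashMap<>();
        for (int s = 0; s < rankedLists.size(); s++) {
            List<Long> list = rankedLists.get(s);
            for (int i = 0; i < list.size(); i++) {
                // First position scores 1.0, last position scores 1/size.
                double positional = (double) (list.size() - i) / list.size();
                scores.merge(list.get(i), weights[s] * positional, Double::sum);
            }
        }

        List<Long> ids = new ArrayList<>(scores.keySet());
        // Higher blended score first; ties broken by product ID for a stable order.
        ids.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Long.compare(a, b);
        });
        return new ArrayList<>(ids.subList(0, Math.min(topN, ids.size())));
    }
}
```

With weights 0.6, 0.3 and 0.1, a product ranked first by collaborative filtering scores 0.6, while one ranked last of four by collaborative filtering and first by popularity scores 0.15 + 0.3 = 0.45.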
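4. Architecture Design
4.1 Overall Architecture
4.1.1 Architecture Diagram
    User request
      ↓
    API gateway (routing + authentication)
      ↓
      ├──→ Search service (product search)
      └──→ Recommendation service (product recommendation)
      ↓
      ├──→ Elasticsearch (search engine)
      ├──→ Redis (cache, user behavior)
      ├──→ MySQL (product data, user data)
      ├──→ Kafka (user behavior collection)
      └──→ Recommendation model service (model inference)
      ↓
    Data collection → Feature engineering → Model training → Model deployment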
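4.1.2 Architecture Notes
Access layer:
- API gateway: routing, authentication
Service layer:
- Search service: product search, search optimization
- Recommendation service: personalized recommendation, popular recommendation
Data layer:
- Elasticsearch: search engine, index storage
- Redis: cache, user behavior data
- MySQL: product data, user data
- Kafka: user behavior collection
Algorithm layer:
- Feature engineering: feature extraction, feature processing
- Model training: training and evaluation
- Model serving: inference, real-time recommendation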
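4.2 Data Flow
4.2.1 Search Data Flow
    Product data → Data sync → Elasticsearch index → Search service → User
4.2.2 Recommendation Data Flow
    User behavior → Kafka → Feature engineering → Model training → Model serving → Recommendation service → User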
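5. User Behavior Collection
5.1 Behavior Data Model
5.1.1 Behavior Types
Behavior types:
- View (VIEW): viewing a product detail page
- Click (CLICK): clicking a product
- Purchase (PURCHASE): buying a product
- Favorite (FAVORITE): adding a product to favorites
- Share (SHARE): sharing a product
5.1.2 Behavior Data Model
Behavior data structure:
    // Getters and setters omitted for brevity.
    public class UserBehavior {
        private Long userId;
        private Long productId;
        private String behaviorType;  // VIEW, CLICK, PURCHASE, FAVORITE or SHARE
        private Long timestamp;       // epoch milliseconds
        private String context;       // JSON string, e.g. page, position, order details
    }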
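Collaborative filtering needs a numeric strength for each interaction, while the model above only records a behavior type. A simple way to bridge the two is a fixed weight per type, as sketched below. The weights are purely illustrative and would need tuning against real data, and the Event class is a stand-in for UserBehavior that keeps the example self-contained.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: map behavior types to implicit-feedback scores (weights are illustrative).
final class BehaviorScoreSketch {

    // Minimal stand-in for UserBehavior, holding only what this sketch needs.
    static final class Event {
        final long productId;
        final String behaviorType;

        Event(long productId, String behaviorType) {
            this.productId = productId;
            this.behaviorType = behaviorType;
        }
    }

    // Stronger intent gets a higher weight; unknown or missing types score 0.
    static double scoreOf(String behaviorType) {
        if (behaviorType == null) {
            return 0.0;
        }
        switch (behaviorType) {
            case "VIEW":     return 1.0;
            case "CLICK":    return 2.0;
            case "FAVORITE": return 3.0;
            case "SHARE":    return 4.0;
            case "PURCHASE": return 5.0;
            default:         return 0.0;
        }
    }

    // Total score per product across all of one user's events.
    static Map<Long, Double> scoresByProduct(List<Event> events) {
        Map<Long, Double> scores = new HashMap<>();
        for (Event e : events) {
            scores.merge(e.productId, scoreOf(e.behaviorType), Double::sum);
        }
        return scores;
    }
}
```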
5.2 Behavior Collection
5.2.1 Front-End Tracking
JavaScript tracking:
    // getUserId() is assumed to be provided by the page.
    function trackBehavior(behaviorType, productId, context) {
      const behavior = {
        userId: getUserId(),
        productId: productId,
        behaviorType: behaviorType,
        timestamp: Date.now(),
        // The backend model declares context as a String, so serialize the object.
        context: JSON.stringify(context)
      };

      fetch('/api/behavior/track', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json'
        },
        body: JSON.stringify(behavior),
        keepalive: true // let the request finish if the user navigates away
      }).catch((err) => console.warn('behavior tracking failed', err));
    }

    // Product detail page viewed
    trackBehavior('VIEW', productId, {
      page: 'product_detail',
      source: 'search'
    });

    // Product clicked in the search results
    trackBehavior('CLICK', productId, {
      position: 'search_result',
      rank: 3
    });

    // Product purchased
    trackBehavior('PURCHASE', productId, {
      orderId: orderId,
      amount: amount
    });
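5.2.2 Back-End Collection
Behavior collection service:
    import com.alibaba.fastjson.JSON;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.web.bind.annotation.*;

    // Response is the application's own API wrapper (not shown in this article).
    @RestController
    @RequestMapping("/api/behavior")
    public class BehaviorController {

        private static final Logger log = LoggerFactory.getLogger(BehaviorController.class);

        @Autowired
        private KafkaTemplate<String, String> kafkaTemplate;

        @PostMapping("/track")
        public Response<String> trackBehavior(@RequestBody UserBehavior behavior) {
            try {
                // Keyed by user ID so one user's events stay ordered within a partition.
                // send() is asynchronous: this catch only sees failures raised before
                // the record is handed to the producer, not later delivery errors.
                kafkaTemplate.send("user-behavior",
                        behavior.getUserId().toString(),
                        JSON.toJSONString(behavior));
                return Response.success("Behavior recorded");
            } catch (Exception e) {
                log.error("Failed to record behavior", e);
                return Response.error("Failed to record behavior");
            }
        }
    }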
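5.2.3 Kafka Consumer
Behavior data processing:
    import java.util.concurrent.TimeUnit;
    import com.alibaba.fastjson.JSON;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.stereotype.Component;

    // storeBehaviorToDatabase and updateUserProfile are not shown in this article.
    @Component
    public class BehaviorConsumer {

        private static final Logger log = LoggerFactory.getLogger(BehaviorConsumer.class);

        @Autowired
        private RedisTemplate<String, String> redisTemplate;

        // Taking the ConsumerRecord gives access to both key (user ID) and value;
        // two bare String parameters cannot be bound unambiguously.
        @KafkaListener(topics = "user-behavior", groupId = "behavior-processor")
        public void processBehavior(ConsumerRecord<String, String> record) {
            try {
                UserBehavior behavior = JSON.parseObject(record.value(), UserBehavior.class);
                storeBehaviorToRedis(behavior);
                storeBehaviorToDatabase(behavior);
                updateUserProfile(behavior);
            } catch (Exception e) {
                // The error is logged and the record is skipped, so a failed event is lost.
                log.error("Failed to process behavior data", e);
            }
        }

        // Keep each user's 1000 most recent behaviors for 7 days.
        private void storeBehaviorToRedis(UserBehavior behavior) {
            String key = "user:behavior:" + behavior.getUserId();
            redisTemplate.opsForList().leftPush(key, JSON.toJSONString(behavior));
            redisTemplate.opsForList().trim(key, 0, 999);
            redisTemplate.expire(key, 7, TimeUnit.DAYS);
        }
    }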
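6. Feature Engineering
6.1 Feature Extraction
6.1.1 User Features
User feature extraction:
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    // UserService, ProductService and UserFeatures are not shown in this article.
    @Service
    public class FeatureService {

        @Autowired
        private UserService userService;

        @Autowired
        private ProductService productService;

        @Autowired
        private UserBehaviorMapper userBehaviorMapper;

        public UserFeatures extractUserFeatures(Long userId) {
            UserFeatures features = new UserFeatures();

            // Profile attributes
            User user = userService.getUser(userId);
            features.setAge(user.getAge());
            features.setGender(user.getGender());

            // Behavior counts (Stream.count() returns long)
            List<UserBehavior> behaviors = userBehaviorMapper.selectByUserId(userId);
            features.setViewCount(behaviors.stream()
                    .filter(b -> "VIEW".equals(b.getBehaviorType()))
                    .count());
            features.setPurchaseCount(behaviors.stream()
                    .filter(b -> "PURCHASE".equals(b.getBehaviorType()))
                    .count());

            // Category preferences: number of interactions per category.
            // This looks up one product per behavior; batch the lookups for large histories.
            Map<String, Integer> categoryPrefs = new HashMap<>();
            behaviors.forEach(b -> {
                Product product = productService.getProduct(b.getProductId());
                if (product != null) { // the product may have been deleted
                    categoryPrefs.merge(product.getCategory(), 1, Integer::sum);
                }
            });
            features.setCategoryPreferences(categoryPrefs);

            return features;
        }
    }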
6.1.2 Product Features
Product feature extraction:
    // Another method of the same FeatureService class shown in 6.1.1.
    @Service
    public class FeatureService {

        public ProductFeatures extractProductFeatures(Long productId) {
            ProductFeatures features = new ProductFeatures();

            Product product = productService.getProduct(productId);
            features.setCategory(product.getCategory());
            features.setBrand(product.getBrand());
            features.setPrice(product.getPrice());
            features.setSales(product.getSales());
            features.setRating(product.getRating());

            return features;
        }
    }
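6.2 Feature Storage
6.2.1 Feature Store
Feature store service:
    import java.util.concurrent.TimeUnit;
    import com.alibaba.fastjson.JSON;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.stereotype.Service;

    @Service
    public class FeatureStoreService {

        @Autowired
        private RedisTemplate<String, String> redisTemplate;

        // Used below but not declared in the original listing.
        @Autowired
        private FeatureService featureService;

        // Cache a user's features for one day.
        public void storeUserFeatures(Long userId, UserFeatures features) {
            String key = "features:user:" + userId;
            redisTemplate.opsForValue().set(key, JSON.toJSONString(features),
                    1, TimeUnit.DAYS);
        }

        // Read from Redis; on a miss, compute the features and cache them.
        public UserFeatures getUserFeatures(Long userId) {
            String key = "features:user:" + userId;
            String value = redisTemplate.opsForValue().get(key);
            if (value != null) {
                return JSON.parseObject(value, UserFeatures.class);
            }

            UserFeatures features = featureService.extractUserFeatures(userId);
            storeUserFeatures(userId, features);
            return features;
        }
    }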
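7. Model Training and Deployment
7.1 Model Training
7.1.1 Preparing Training Data
Training data:
    import java.util.ArrayList;
    import java.util.List;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    @Service
    public class TrainingDataService {

        @Autowired
        private UserBehaviorMapper userBehaviorMapper;

        @Autowired
        private FeatureService featureService;

        // One sample per behavior record. selectAll() loads the whole table and
        // features are re-extracted for every row, so this suits small data sets only.
        // User features are computed from the full history, including the behavior
        // being labeled, which leaks the label into the features.
        public List<TrainingSample> prepareTrainingData() {
            List<TrainingSample> samples = new ArrayList<>();

            List<UserBehavior> behaviors = userBehaviorMapper.selectAll();
            for (UserBehavior behavior : behaviors) {
                UserFeatures userFeatures =
                        featureService.extractUserFeatures(behavior.getUserId());
                ProductFeatures productFeatures =
                        featureService.extractProductFeatures(behavior.getProductId());

                TrainingSample sample = new TrainingSample();
                sample.setUserFeatures(userFeatures);
                sample.setProductFeatures(productFeatures);
                sample.setLabel(calculateLabel(behavior));
                samples.add(sample);
            }

            return samples;
        }

        // Purchases and favorites are positives (1); views, clicks and shares are negatives (0).
        private int calculateLabel(UserBehavior behavior) {
            if ("PURCHASE".equals(behavior.getBehaviorType())
                    || "FAVORITE".equals(behavior.getBehaviorType())) {
                return 1;
            }
            return 0;
        }
    }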
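7.1.2 Model Training
Model training service (an example using Python and TensorFlow):
    import tensorflow as tf
    from tensorflow import keras


    def build_model(feature_dim):
        """Binary classifier: probability that a user interacts with a product."""
        model = keras.Sequential([
            keras.layers.Dense(128, activation='relu', input_shape=(feature_dim,)),
            keras.layers.Dropout(0.2),
            keras.layers.Dense(64, activation='relu'),
            keras.layers.Dropout(0.2),
            keras.layers.Dense(1, activation='sigmoid')
        ])

        model.compile(
            optimizer='adam',
            loss='binary_crossentropy',
            metrics=['accuracy']
        )
        return model


    def train_model():
        # load_training_data() is not defined in this article; it is assumed to
        # return a 2-D feature array and a 1-D array of 0/1 labels.
        X_train, y_train = load_training_data()

        # The input width is taken from the data instead of a global variable.
        model = build_model(X_train.shape[1])
        model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

        # Saved as HDF5 here. TensorFlow Serving (section 7.2) loads the SavedModel
        # format, so the model must be exported to a versioned SavedModel directory
        # before it can be served.
        model.save('recommendation_model.h5')
        return model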
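7.2 Model Serving
7.2.1 Model Deployment
Model service (using TensorFlow Serving):
    import javax.annotation.PostConstruct;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    // TfServingClient is a placeholder for an HTTP client wrapper around the
    // TensorFlow Serving REST API (port 8501); it is not a library class.
    // buildFeatureVector(...) is likewise not shown in this article.
    @Service
    public class ModelService {

        @Autowired
        private FeatureService featureService;

        private TfServingClient client;

        @PostConstruct
        public void init() {
            client = new TfServingClient("http://model-server:8501");
        }

        // Predicted probability that the user interacts with the product.
        public double predict(Long userId, Long productId) {
            // Features are extracted on every call here; reading them from the
            // feature store in 6.2 would avoid repeated database queries.
            UserFeatures userFeatures = featureService.extractUserFeatures(userId);
            ProductFeatures productFeatures = featureService.extractProductFeatures(productId);

            // The vector layout must match the one used at training time.
            double[] features = buildFeatureVector(userFeatures, productFeatures);
            return client.predict(features);
        }
    }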
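The predict method depends on a buildFeatureVector helper that the article leaves out. The sketch below shows one plausible layout: scaled numeric features followed by a one-hot category. The category vocabulary, the scaling choices (age divided by 100, rating divided by 5, log1p on counts and price) and the flat parameter list are all assumptions for illustration; the real layout has to match whatever the training pipeline used.

```java
import java.util.List;

// Sketch of a fixed-layout feature vector (all choices here are illustrative).
final class FeatureVectorSketch {

    // Hypothetical category vocabulary; its order fixes the one-hot positions.
    static final List<String> CATEGORIES = List.of("phone", "laptop", "book");
    static final int NUMERIC_FEATURES = 7;

    static double[] build(int age, boolean male, long viewCount, long purchaseCount,
                          String category, double price, int sales, double rating) {
        double[] v = new double[NUMERIC_FEATURES + CATEGORIES.size()];
        v[0] = age / 100.0;               // rough scaling to about [0, 1]
        v[1] = male ? 1.0 : 0.0;
        v[2] = Math.log1p(viewCount);     // log1p dampens heavy-tailed counts
        v[3] = Math.log1p(purchaseCount);
        v[4] = Math.log1p(price);
        v[5] = Math.log1p(sales);
        v[6] = rating / 5.0;              // assumes ratings on a 0 to 5 scale

        // One-hot category; an unknown category leaves all positions at 0.
        int idx = CATEGORIES.indexOf(category);
        if (idx >= 0) {
            v[NUMERIC_FEATURES + idx] = 1.0;
        }
        return v;
    }
}
```

Keeping the layout in one place, shared by training data export and online inference, avoids the two drifting apart.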
8. Case Studies
8.1 E-commerce Search
8.1.1 Search API
Search endpoint:
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/api/search")
    public class SearchController {

        @Autowired
        private SearchService searchService;

        @GetMapping("/products")
        public Response<SearchResult<Product>> searchProducts(
                @RequestParam String keyword,
                @RequestParam(required = false) String category,
                @RequestParam(required = false) String brand,
                @RequestParam(required = false) Double minPrice,
                @RequestParam(required = false) Double maxPrice,
                @RequestParam(defaultValue = "1") Integer page,
                @RequestParam(defaultValue = "20") Integer pageSize) {

            SearchRequest request = new SearchRequest();
            request.setKeyword(keyword);
            request.setCategory(category);
            request.setBrand(brand);
            request.setMinPrice(minPrice);
            request.setMaxPrice(maxPrice);
            // Pages are 1-based; clamp both values so a bad request cannot ask for
            // page 0 or an unbounded page size (the cap of 100 is an arbitrary choice).
            request.setPage(Math.max(page, 1));
            request.setPageSize(Math.min(Math.max(pageSize, 1), 100));

            SearchResult<Product> result = searchService.searchProducts(request);
            return Response.success(result);
        }
    }
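8.2 Content Recommendation
8.2.1 Recommendation API
Recommendation endpoint:
    import java.util.List;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/api/recommend")
    public class RecommendController {

        @Autowired
        private HybridRecommendationService recommendationService;

        // Used below but not declared in the original listing.
        @Autowired
        private ProductService productService;

        @GetMapping("/products")
        public Response<List<Product>> recommendProducts(
                @RequestParam Long userId,
                @RequestParam(defaultValue = "20") Integer topN) {

            List<Long> productIds = recommendationService.hybridRecommend(userId, topN);

            // getProducts is assumed to return products in the order of productIds;
            // otherwise the ranking has to be restored after the lookup.
            List<Product> products = productService.getProducts(productIds);
            return Response.success(products);
        }
    }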
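9. Performance Optimization
9.1 Search Optimization
9.1.1 Index Optimization
Index optimization:
9.1.2 Query Optimization
Query optimization:
9.2 Recommendation Optimization
9.2.1 Cache Optimization
Recommendation result caching:
    import java.util.List;
    import java.util.concurrent.TimeUnit;
    import com.alibaba.fastjson.JSON;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.stereotype.Service;

    // doRecommend(userId, topN) stands for the actual recommendation computation,
    // for example HybridRecommendationService.hybridRecommend.
    @Service
    public class RecommendationService {

        @Autowired
        private RedisTemplate<String, String> redisTemplate;

        public List<Long> recommend(Long userId, int topN) {
            String cacheKey = "recommend:" + userId + ":" + topN;

            String cached = redisTemplate.opsForValue().get(cacheKey);
            if (cached != null) {
                return JSON.parseArray(cached, Long.class);
            }

            List<Long> recommendations = doRecommend(userId, topN);

            // Cached for one hour, so new behavior is reflected with up to an hour's delay.
            redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(recommendations),
                    1, TimeUnit.HOURS);
            return recommendations;
        }
    }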
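9.2.2 Asynchronous Computation
Asynchronous recommendation:
    // @Async only takes effect when async support is enabled (@EnableAsync) and the
    // method is called through the Spring proxy, that is, from another bean.
    @Service
    public class RecommendationService {

        @Async
        public CompletableFuture<List<Long>> recommendAsync(Long userId, int topN) {
            List<Long> recommendations = doRecommend(userId, topN);
            return CompletableFuture.completedFuture(recommendations);
        }
    }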
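Asynchronous computation pays off most when the caller can stop waiting. The sketch below, using only java.util.concurrent (Java 9 or later for completeOnTimeout), runs a personalized recommendation in the background and falls back to a precomputed list, such as popular products, if the computation is too slow or fails. The method name, the Supplier parameter and the timeout are assumptions for illustration and are independent of Spring's @Async.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch of an async recommendation call with a timeout fallback
// (names are illustrative; requires Java 9+ for completeOnTimeout).
final class AsyncRecommendSketch {

    // Returns the personalized list if it arrives within timeoutMillis,
    // otherwise the fallback. Failures in the supplier also yield the fallback.
    static List<Long> recommendWithFallback(Supplier<List<Long>> personalized,
                                            List<Long> fallback,
                                            long timeoutMillis) {
        return CompletableFuture.supplyAsync(personalized)
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback)
                .join();
    }
}
```

This version runs on the common fork-join pool; a service with heavy recommendation traffic would more likely pass a dedicated executor to supplyAsync so slow computations cannot starve other work.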
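10. Summary
10.1 Key Points
- Search system: Elasticsearch, index design, search optimization
- Recommendation system: collaborative filtering, content-based recommendation, hybrid recommendation
- User behavior: behavior collection, behavior analysis
- Feature engineering: feature extraction, feature storage
- Model training: model training, model deployment
10.2 Key Design Decisions
- Search engine: Elasticsearch for full-text search
- Recommendation algorithms: collaborative filtering, content-based recommendation, hybrid recommendation
- Behavior collection: asynchronous collection through Kafka
- Feature engineering: user features, product features
- Model serving: TensorFlow Serving
10.3 Best Practices
- Search optimization: index optimization, query optimization, caching
- Recommendation optimization: result caching, asynchronous computation
- Data collection: real-time collection, asynchronous processing
- Model training: periodic retraining, model updates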