
A discussion of AI for content distribution and natural language dialogue (NLP) on the YY live broadcast platform


This article is about applying AI for content distribution and natural language dialogue on a live broadcast platform. It focuses on definition, function and implementation, and also brings in relevant information and cases for analysis. It doubles as a summary of some of the questions from the third of six rounds of interviews for an AI product manager position at YY, which did end in an offer. (Some content has been removed because of conflicts of interest.) If you are interested in interviewing and preparing for AI product manager roles, you are welcome to follow and bookmark this series; updates will continue.

the contents of this article are as follows:

1 AI application of content distribution in live broadcast platform

1.1 audit and supervision of content

1.1.1 first: Problems and risks

1.1.2 look again: traditional solutions

1.1.3 application improvement: AI

1.2 personalized content distribution

1.2.1 background

1.2.2 use AI to mine content characteristics

2 AI application of natural language dialogue in live broadcast platform

2.1 customer service robot

2.1.1 status quo

2.1.2 dialogue increases emotional analysis

2.1.3 dialogue enhances self-learning

2.1.4 dialogue enhances intention analysis, context analysis

2.1.5 other

2.2 voice assistant

2.3 live broadcast room assistance

2.3.1 scenario 1: language expression defects

2.3.2 Scenario 2: when the voice in the live broadcast environment is unavailable

2.3.3 scenario 3: live studio auxiliary

3 Summary

1 AI application of content distribution in live broadcast platform

First, the content on a live broadcast platform can be broadly divided into three types: content creators (anchors), live video, and short video.

Second, in terms of content going "in and out", there are two aspects: content audit and supervision, and personalized content distribution.

Finally, there is the production and creation of content (not discussed in this article; it will be covered separately later).

1.1 audit and supervision of content

This scenario focuses on systematic management and control, improving efficiency, and reducing costs.

1.1.1 let's start with: Problems and risks

(1) Live broadcast content is complex to monitor, and manual review easily misses violations.

Violations in live broadcasts take many forms, such as pornography, advertising, infringement, gambling, violence, political content, sensitive content, screen-in-screen, etc. Manual review and standardized audit models both struggle to identify them accurately, so the probability of misjudgment and omission is high.

(2) The scale of online live broadcasting is huge, and manual audit is expensive.

Supervision has to run in real time, 24 hours a day. Although the proportion of violations is not high, letting "no fish slip through the net" requires heavy investment of people, equipment and money, which raises the pressure on operating costs.

(3) Live broadcast traffic is concentrated at night, when manual audit efficiency is low.

Reviewers tire at night: the accuracy of recognition by eye drops, misjudgments and missed judgments become more likely, and audit efficiency falls, making it hard to meet the content supervision requirements of live broadcasting.

(4) It is difficult to verify the anchor's real identity, both at registration and in real time during live broadcasts.

First, real-name verification at anchor registration: if it depends entirely on human audit, labor costs rise and a genuinely effective audit is hard to achieve. Second, verifying at every broadcast that the registered anchor is the person actually broadcasting adds further labor cost and is even harder to operate.

1.1.2 look again: traditional solutions

There are three traditional audit methods:

pure manual audit: staff work in three shifts and judge by eye whether a picture or video is illegal

MD5 database: a database of MD5 hashes of known illegal pictures and videos is maintained for supervision, and the MD5 of each upload is automatically checked against it, so that pornographic content cannot be shared repeatedly

traditional intelligent audit:

Disadvantages: these audit methods have large loopholes. Three-shift manual work easily leads to human problems such as low audit efficiency and many false and missed judgments. An MD5 match is easily evaded, because any modification to a file changes its hash. Traditional intelligent recognition of pornographic pictures has low accuracy and frequent false positives. On top of that, all of these find it even harder to meet the demand for live video audit, which has become popular in the past two years.
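The weakness of MD5 matching can be illustrated with a minimal sketch. The class name `Md5Blocklist` and the placeholder bytes are assumptions made up for this example; only `hashlib.md5` from the Python standard library is a real API. The sketch shows that an exact copy of a known file is caught, while a copy with a single extra byte is not.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


class Md5Blocklist:
    """Hypothetical blocklist that stores MD5 hashes of known illegal files."""

    def __init__(self) -> None:
        self._hashes = set()

    def add(self, data: bytes) -> None:
        self._hashes.add(md5_hex(data))

    def is_blocked(self, data: bytes) -> bool:
        # Only an exact byte-for-byte match produces the same hash.
        return md5_hex(data) in self._hashes


blocklist = Md5Blocklist()

# Placeholder bytes standing in for a known illegal picture.
known_bad = b"placeholder bytes of a known illegal picture"
blocklist.add(known_bad)

# Re-uploading the identical file is caught.
exact_copy_blocked = blocklist.is_blocked(known_bad)

# Appending one byte (or re-encoding, cropping, etc.) changes the hash,
# so the altered copy passes the check.
altered_copy_blocked = blocklist.is_blocked(known_bad + b"\x00")
```

This is why exact-hash matching only prevents verbatim re-sharing and cannot stand in for recognizing the content itself.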

1.1.3 application improvement: AI

Given the scenarios and problems above, AI technology can be introduced as a practical improvement.

(1) Scheme: AI recognition + manual review

(2) Scenario: take pornography detection as an example. A pornography detection model classifies content as "pornographic", "sexy" or "normal", and the machine automatically splits the results into two parts: confirmed and to be reviewed. For the confirmed part, recognition accuracy matches or exceeds manual work, so no review is needed. For the review part, the machine sorts items by likelihood, and manual audit proceeds from the highest probability to the lowest.

Similarly, this can be extended to other review tasks such as anchor cover images.
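The "AI recognition + manual review" split described above could be sketched roughly as follows. The threshold value, the `Prediction` class and the example scores are all assumptions for illustration; the article does not state how YY sets its confirmation threshold or what the model outputs look like.

```python
from dataclasses import dataclass

LABELS = ("pornographic", "sexy", "normal")


@dataclass
class Prediction:
    """Hypothetical model output: one probability per label for an item."""
    item_id: str
    scores: dict  # label -> probability

    @property
    def label(self) -> str:
        # The predicted label is the one with the highest probability.
        return max(self.scores, key=self.scores.get)

    @property
    def confidence(self) -> float:
        return self.scores[self.label]


def triage(predictions, confirm_threshold=0.95):
    """Split predictions into a confirmed part and a manual review queue.

    Items whose top label reaches the (assumed) threshold are confirmed
    without review. The rest are queued for human reviewers, sorted by
    the probability of being pornographic, highest first.
    """
    confirmed, review = [], []
    for p in predictions:
        if p.confidence >= confirm_threshold:
            confirmed.append(p)
        else:
            review.append(p)
    review.sort(key=lambda p: p.scores["pornographic"], reverse=True)
    return confirmed, review


predictions = [
    Prediction("a", {"pornographic": 0.98, "sexy": 0.01, "normal": 0.01}),
    Prediction("b", {"pornographic": 0.01, "sexy": 0.02, "normal": 0.97}),
    Prediction("c", {"pornographic": 0.55, "sexy": 0.30, "normal": 0.15}),
    Prediction("d", {"pornographic": 0.20, "sexy": 0.45, "normal": 0.35}),
    Prediction("e", {"pornographic": 0.40, "sexy": 0.35, "normal": 0.25}),
]

confirmed, review = triage(predictions)
confirmed_ids = [p.item_id for p in confirmed]
review_ids = [p.item_id for p in review]
```

In this sketch items "a" and "b" are confirmed automatically, while "c", "e" and "d" go to reviewers in that order, so human attention is spent first on the items most likely to be violations.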

1.2 personalized content distribution

Focus: recommended live video matches users' expectations more closely, users' choices when watching live video become more intuitive and accurate, user activity improves significantly, and the platform's click-through rate and retention rate rise significantly.

1.2.1 background

In the era of big data, personalized recommendation has become standard for e-commerce and content products, so its benefits are not repeated here.

Content distribution here means the personalized ordering of content on the front end, that is, personalized recommendation.

At present the mainstream recommendation algorithm is collaborative filtering, and a recommendation engine is a combination of multiple recommendation algorithms. We will not go deep into the algorithms here. The point is that whatever the algorithm or engine, it computes on user portraits (profiles) and item (content) portraits; without these basic features, personalized content distribution is hard to carry out.

At the same time, the interests and behavior preferences in a user portrait are usually mapped from content portraits. For example, if user A reads an article titled "Jay Chou's latest concert is scheduled!", the user may be tagged with "entertainment preference", "Jay Chou" and similar labels, with different weight scores assigned for different behaviors. How content features are identified therefore affects the effect, efficiency and experience of personalized content distribution.
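The mapping from content tags to a user portrait can be sketched as below. The behavior names and their weights in `BEHAVIOR_WEIGHTS` are invented for illustration; the article only says that different behaviors receive different weight scores, not what those scores are.

```python
from collections import defaultdict

# Assumed weights: a stronger behavior contributes more to the tag score.
BEHAVIOR_WEIGHTS = {"view": 1.0, "like": 2.0, "share": 3.0}


def update_user_portrait(portrait, content_tags, behavior):
    """Add the behavior's weight to every tag carried by the content."""
    weight = BEHAVIOR_WEIGHTS[behavior]
    for tag in content_tags:
        portrait[tag] += weight
    return portrait


# The user portrait is a mapping from tag to accumulated weight.
portrait = defaultdict(float)

# Tags that content recognition attached to the Jay Chou article.
article_tags = ["entertainment preference", "Jay Chou"]

update_user_portrait(portrait, article_tags, "view")
update_user_portrait(portrait, article_tags, "share")
update_user_portrait(portrait, ["sports"], "view")

# Tags ordered from strongest to weakest interest.
top_tags = sorted(portrait, key=portrait.get, reverse=True)
```

The user portrait is only as good as the tags on the content: if the article had not been tagged "Jay Chou", no amount of reading behavior would surface that interest.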

1.2.2 use AI to mine content features

Here we only analyze live video.

Using AI technology to analyze and understand live video along four dimensions (face, image, music and language), we can perform basic classification and characterization.

(1) content features

In terms of recognition difficulty, live video is harder than articles and pictures.

In terms of content characteristics, a live broadcast has interactivity, a scene, talent types, and even the anchor's "tone".

In terms of timeliness, a live broadcast is real-time.


(2) application process

First, based on the characteristics above, live video content should be broken down into multiple dimensions, with AI used to recognize the content and mine rich content features.

Second, because a live broadcast is real-time and produced by the anchor, the anchor exists before the live video content does. The features of live content should therefore be attached to the anchor.

Finally, one possible flow for personalized content distribution is: user features and anchor features enter the recommendation engine, which recalls a weighted list of anchors matching the user's preferences. The system then checks whether those anchors are currently live, how long each broadcast has been running, the degree of intimacy and other dimensions to make a combined recommendation, achieving personalized content distribution.
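The recall-then-rerank flow in the last step could look roughly like this. Everything numeric here is an assumption: the dot-product match score, the 60-minute freshness cutoff and the way intimacy is multiplied in are placeholders, not the platform's actual ranking formula. Intimacy is taken to mean closeness between the user and the anchor on a 0 to 1 scale.

```python
from dataclasses import dataclass


@dataclass
class Anchor:
    """Hypothetical anchor record used by the sketch."""
    name: str
    features: dict       # tag -> weight mined from the anchor's broadcasts
    is_live: bool
    minutes_live: float  # how long the current broadcast has been running
    intimacy: float      # assumed user-anchor closeness, 0..1


def match_score(user_features, anchor_features):
    """Dot product of user tag weights and anchor tag weights."""
    return sum(w * anchor_features.get(tag, 0.0)
               for tag, w in user_features.items())


def recall(user_features, anchors, k=3):
    """Return up to k (score, anchor) pairs that match the user at all."""
    scored = [(match_score(user_features, a.features), a) for a in anchors]
    scored = [(s, a) for s, a in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def rerank(recalled):
    """Drop offline anchors, then adjust by broadcast age and intimacy."""
    ranked = []
    for score, a in recalled:
        if not a.is_live:
            continue
        # Assumed rule: broadcasts older than an hour count for half.
        freshness = 1.0 if a.minutes_live <= 60 else 0.5
        ranked.append((score * freshness * (1.0 + a.intimacy), a.name))
    ranked.sort(reverse=True)
    return [name for _, name in ranked]


user = {"singing": 1.0, "cute girl": 0.5}
anchors = [
    Anchor("A", {"singing": 0.9}, True, 30, 0.2),
    Anchor("B", {"singing": 0.8, "cute girl": 0.8}, False, 0, 0.9),
    Anchor("C", {"cute girl": 1.0}, True, 120, 0.8),
    Anchor("D", {"gaming": 1.0}, True, 10, 0.0),
]

recommended = rerank(recall(user, anchors))
```

Anchor "B" is the best feature match but is offline, so it is dropped at the rerank stage; "D" never matches the user's tags and is not recalled. Keeping recall (who fits the user) separate from rerank (who is worth showing right now) reflects the point that features belong to the anchor while availability belongs to the moment.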

(3) AI recognition dimensions and content distribution ranking

Some feature dimensions for content recognition (examples only, not exhaustive):

From the live broadcast:

interactivity: interaction between the anchor and fans, including spoken exchanges, feedback, questions and answers, etc

reward: tips and gifts, which directly reflect revenue. Business factors sometimes need to be taken into account here, since rewards feed the platform's commission income

duration: duration of the live broadcast

scene: the live broadcast scene, outdoor or indoor, such as a live broadcast room, sports field, gym, car, etc

objects: goods, decorations, beauty products, etc

departure: whether the anchor leaves the broadcast, how often, etc

From the anchor:

behavior: singing, chatting, performing, magic, commentary, frequent gestures...

voice: sweet type, tomboy type, soothing type

talent: playing the piano, singing

style: Korean-style dress, sexy, mature, gentlemanly, sporty and sunny, cute girl...

gender: male, female...

age: age range, or birth decade ("post-XX"), depending on how the model defines it

appearance score:

For example:

style: a young female anchor who often makes a pouting expression is likely to be labeled a "cute girl" (face recognition)

appearance score: score the anchor's appearance with an appearance model (face recognition)

voice: classify the anchor's voice type with a model (speech recognition)

gesture: an anchor who often makes continuous gestures or body movements such as finger hearts, heart shapes or blowing kisses may be labeled with tags such as gesture expert or heart interaction (dynamic gesture recognition)

All of the recognition above essentially relies on AI technology to mine content features for recommendation.

(4) ranking in the personalized content distribution algorithm

YY's own basic dimensions:

(5) other

Personalized content distribution can be used in multiple scenarios, such as search results pages, home pages and following pages, with recommendation strategies tailored to different user groups. Beyond content feature recognition, AI also underpins the deep learning and knowledge graphs used in the recommendation engine.

2 AI application of natural language dialogue in live broadcast platform

First of all, NLP covers many fields and may be used in any scenario with natural language (voice or text) input, such as semantic analysis and machine translation. Natural language dialogue here refers to intelligent assistants, intelligent question answering, voice services and the like. My personal understanding is that it uses AI technology to create a unified CUI (conversational user interface) and a one-stop integrated information service.

Second, by dialogue scenario there are closed-domain and open-domain dialogues. Roughly speaking, the former "requires the user to input specified words to continue the conversation": inputs and outputs are enumerable, with a clear beginning and end. The latter means "whatever the user says, the dialogue can continue": inputs and outputs cannot be exhaustively listed, and there is no fixed process.
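A closed-domain dialogue, with its enumerable inputs and clear beginning and end, can be sketched as a small state table. The `FLOW` table, its prompts and the option keys are invented for illustration and are not taken from any YY product.

```python
# Hypothetical closed-domain flow: every state lists the only inputs it
# accepts, so all inputs and outputs can be enumerated in advance.
FLOW = {
    "start": {
        "prompt": "Reply 1 for recharge issues or 2 for account issues.",
        "options": {"1": "recharge", "2": "account"},
    },
    "recharge": {
        "prompt": "Recharge records are under My Wallet. Conversation ended.",
        "options": {},
    },
    "account": {
        "prompt": "Passwords can be reset under Settings. Conversation ended.",
        "options": {},
    },
}


def step(state, user_input):
    """Advance the dialogue by one turn and return (next_state, reply)."""
    next_state = FLOW[state]["options"].get(user_input.strip())
    if next_state is None:
        # Input outside the enumerated options: stay put and re-prompt.
        return state, "Sorry, I did not understand. " + FLOW[state]["prompt"]
    return next_state, FLOW[next_state]["prompt"]


def is_finished(state):
    """A state with no further options marks the end of the dialogue."""
    return not FLOW[state]["options"]


state = "start"
state_after_unknown, reply_unknown = step(state, "hello")
state_after_choice, reply_choice = step(state, "1")
finished = is_finished(state_after_choice)
```

An open-domain dialogue cannot be written this way: since inputs are not enumerable, a lookup table like `FLOW` has nothing to key on, and the system has to understand the input instead of matching it.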

By content, dialogue is either text or voice. Text can generally be processed directly; voice usually has to be converted to text first (ASR technology).

The general principle is as follows. The user's input first goes through natural language analysis; then, drawing on the knowledge accumulated over time in the engine, semantic understanding and context analysis are used for knowledge reasoning, which generates a personalized answer that is output to the user. The whole typical natural language dialogue is shown in the figure below

Finally, we analyze several scenarios: customer service robots, voice assistants, and live broadcast room assistance.

2.1 customer service robot
