US20130173611A1 - Generation of nickname dictionary - Google Patents
- Publication number
- US20130173611A1 (application US 13/779,574)
- Authority
- US
- United States
- Prior art keywords
- messages
- name
- word
- words
- directed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G06F17/3053—
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Definitions
- the present disclosure generally relates to data mining and, more particularly, to mining of information in user communications to develop a nickname dictionary.
- Social networks, or social utilities that enable connections between users, have become prevalent in recent years.
- social network systems allow users to communicate information very efficiently. For example, a user may post contact information, background information, job information, hobbies, and/or other user-specific data to a location associated with the user on a social network system. Other users can then review the posted data by browsing user profiles or searching for profiles including specific data. Users may also post messages directly on user profiles and send messages to a private inbox.
- the social network systems also allow users to associate themselves with other users, thus creating a web of connections among the users of the social network system.
- Searching for users of a social network system typically involves composing a search query including forenames and surnames and submitting it to a search system.
- users adopt nicknames as user names which can be problematic for user searches when a searching user only has forename and surname information.
- some searchers encounter the opposite problem, failing to find a user by a common nickname because the account is under the full forename.
- nicknames that a given individual chooses to adopt may change over time, presenting similar problems even when a searching user remembers a nickname.
- the present invention is directed to methods, apparatuses and systems for generating a nickname dictionary that includes associations between names of users and nicknames based on statistical analysis of user communications observed at a network communications facility, such as a social network system, an email provider and the like.
- a nickname generating process analyzes user communications to develop a nickname dictionary that includes a plurality of entries, each entry identifying a name, a nickname and a confidence score indicating a degree of confidence in the nickname.
- implementations of the invention leverage communications between users to learn nicknames.
- a variety of functions can leverage the resulting nickname dictionary, such as search engines for locating users or user profiles (such as search query suggestions, search query expansion, result ranking), and registration processes (such as username suggestions and data field seeding).
- FIG. 1 is a schematic diagram of a computer network environment, in which particular embodiments of the present invention may operate.
- FIG. 2 is a flow chart setting forth an example process according to one implementation of the invention.
- FIG. 3 is a functional block diagram illustrating an example network device hardware system architecture.
- a social network system offers its users the ability to communicate and interact with other users of the website.
- users join the social network system and then add connections to a number of other users to whom they desire to be connected.
- the term “friend” refers to any other user to whom a user has formed a connection, association, or relationship via the website. Connections may be added explicitly by a user—if for example, the user selects another user as a friend—or automatically created by the social network system based on common characteristics of the users (e.g., users who are alumni of the same educational institution). Connections in social network systems are usually in both directions, but need not be, so the terms “user” and “friend” depend on the frame of reference.
- connection between users may be a direct connection; however, some embodiments of a social network system allow the connection to be indirect via one or more levels of connections.
- friend need not require that users actually be friends in real life (which would generally be the case when one of the users is a business or other entity); it simply implies a connection in the social network system.
- a user of the social network system may be any suitable entity, such as an individual, a corporation, a partnership, a joint venture, and combinations of the foregoing.
- the social network system 20 maintains one or more network communications facilities that provide users with the ability to communicate with other users. Some types of actions include “friend requesting,” “wall posting,” and “sending a message.” Upon acceptance of a friend request, the requestor and requestee become friends. Friends may access more information about each other's profile than other non-friend users.
- a wall post allows users to post a message to a target user's wall. The wall is a forum for comments or insights about another user or a given topic and typically appears on a user's profile page. Typically, a first user can create a wall post on a target user's wall, to which other users, including the target user, may add messages to form a message thread.
- the social network system may also maintain some form of private message communications facility, such as intra- and inter-domain electronic mail, that users access by navigating to a private inbox.
- FIG. 1 illustrates an example network environment, in which embodiments of the invention may operate.
- Network cloud 60 generally represents one or more interconnected networks, over which the systems and hosts described herein can communicate.
- Network cloud 60 may include packet-based wide area networks (such as the Internet), private networks, wireless networks, satellite networks, cellular networks, paging networks, and the like.
- As FIG. 1 illustrates, a particular implementation of the invention can operate in a network environment comprising social network system 20 and one or more client devices 30.
- Client devices 30 are operably connected to the network environment via a network service provider, a wireless carrier, or any other suitable means.
- the social network system 20 comprises computing systems that allow users to communicate or otherwise interact with each other and access content, such as user profiles, as described herein.
- Social network system 20 is a network addressable system that, in one implementation, comprises one or more physical servers 22 and data store 24.
- the one or more physical servers 22 are operably connected to computer network 60 via a router 26.
- the functionality hosted by the one or more physical servers 22 may include web or HTTP servers, FTP servers, and the like.
- Physical servers 22 host functionality directed to the operations of a social network.
- social network system 20 may host a website that allows one or more users, at one or more client devices 30, to communicate with one another via the website.
- Content data store 24 stores content and data relating to, and enabling, operation of the social network as digital data objects.
- a data object, in particular implementations, is an item of digital information typically stored or embodied in a data file, database or record.
- Content objects may take many forms, including: text (e.g., ASCII, SGML, HTML), images (e.g., jpeg, tif and gif), graphics (vector-based or bitmap), audio, video (e.g., mpeg), or other multimedia, and combinations thereof.
- Content object data may also include executable code objects (e.g., games executable within a browser window or frame), podcasts, etc.
- content data store 24 corresponds to a variety of separate and integrated databases, such as relational databases and object-oriented databases, that maintain information as an integrated collection of logically related records or files stored on one or more physical systems.
- content data store 24 connotes a large class of data storage and management systems.
- content data store 24 may be implemented by any suitable physical system including components, such as database servers, mass storage media, media library systems, storage area networks, data storage clouds, and the like.
- Content data store 24 includes data associated with different social network system 20 users.
- the social network system 20 maintains a user profile for each user of the website 20.
- User profiles include data that describe the users of a social network, including proper names (first, middle and last of a person, a tradename or company name of a business entity, etc.), biographic, demographic, and other types of descriptive information, such as work experience, educational history, hobbies or preferences, location, and additional descriptive data.
- user profiles may include a user's birthday, relationship status, city of residence, and the like.
- the website 20 further stores data describing one or more relationships between different users. The relationship information may indicate users who have similar or common work experience, group memberships, hobbies, or educational history.
- a user profile may also include affinity information for another user based on relationships with other users and a user's implicit and explicit interaction with content on the site (reading story headlines, frequency of accessing content, feedback from other users, profiles, etc.).
- a user profile may also include privacy settings indicating how accessible to other users any of the information in the user profile is, including user contact information or user-defined relationships with other users, such as the user's friends, networks, groups, or the like.
- Client device 30 is a computer or computing device including functionality for communicating over a computer network.
- a client node can be a desktop computer, a laptop computer, or a mobile device (including cellular telephones, personal digital assistants, and mobile gaming devices).
- a client device 30 may execute one or more client applications, such as a web browser, to access and view content over a computer network.
- client applications allow users to enter addresses of specific network resources, such as resources hosted by social network system 20 , to be retrieved. These addresses can be Uniform Resource Locators, or URLs.
- the client applications may provide access to other pages or records when the user “clicks” on hyperlinks to other resources.
- Such hyperlinks are located within the web pages and provide an automated way for the user to enter the URL of another page and to retrieve that page.
- the pages or resources can be data records including as content plain textual information, or more complex digitally encoded multimedia content, such as software programs or other code objects, graphics, images, audio signals, videos, and so forth.
- the social network system 20 maintains in content data store 24 a number of objects for the different kinds of items with which a user may interact on the website 100.
- these objects include user profiles, application objects, and message objects (such as for wall posts, invitations, notifications, news feeds, emails and other messages).
- an object is stored by the website 20 for each instance of its associated item.
- When a user creates a message directed to another user, social network system 20 generates a message object that includes a plurality of attributes. Common to most message channel types (such as wall posts, invitations, notifications, news feeds and electronic mail) are attributes such as identity of the sending user, identity of the target user, a text string embodying the message, the date and time the message was sent, and the like. As discussed below, a nickname generating process, executing periodically, can access content data store 24 to analyze the messages and learn associations between user names (first and/or last names) and nicknames possibly contained in the messages.
- This nickname generating process can search for messages between users based on one or more criteria defining a message type.
- message channels e.g., wall posts, notifications, invitations, news feed items, electronic mail, short message service, etc.
- messages can be classified into a variety of different message types, each defined by a set of attributes or matching rules.
- a message type may correspond to “birthday wall posts” defined as wall posts that are directed to a user of social network system 20 on that user's birthday.
- various database queries can be composed that identify all messages of the wall post channel type that were sent to a target user on the birthday included in the user's profile.
- trigger words in the message string itself such as “happy birthday,” “happy B-day,” etc.
- the nickname generating process may search for email messages directed to users that occur on the recipient user's birthday and/or include certain trigger words in addition to or in lieu of wall posts.
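The "birthday wall post" message type described above can be sketched as a simple predicate. This is a minimal illustration, not the patent's implementation: the `Message` record, its field names, the channel labels, and the `require_trigger` switch are all assumptions introduced here, and only the two trigger phrases come from the text.

```python
from dataclasses import dataclass
from datetime import date, datetime


# Hypothetical message record; the field names are assumptions modeled on the
# attributes the text lists (sender, target, message text, date and time).
@dataclass(frozen=True)
class Message:
    sender_id: int
    target_id: int
    channel: str      # assumed labels, e.g. "wall_post" or "email"
    text: str
    sent_at: datetime


# Trigger phrases taken from the examples in the text, lowercased.
TRIGGER_PHRASES = ("happy birthday", "happy b-day")


def is_birthday_message(msg, birthday, channels=("wall_post",), require_trigger=False):
    """Return True if msg arrived on the target's birthday via an allowed channel.

    When require_trigger is True, the message text must also contain one of
    the trigger phrases.
    """
    if msg.channel not in channels:
        return False
    if (msg.sent_at.month, msg.sent_at.day) != (birthday.month, birthday.day):
        return False
    if require_trigger:
        lowered = msg.text.lower()
        return any(phrase in lowered for phrase in TRIGGER_PHRASES)
    return True
```

Passing `channels=("wall_post", "email")` covers the email variant mentioned above. A deployment at scale would more likely express the same conditions as a database query than as a per-message function.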
- Other embodiments might search logs of instant messaging communications between users, Short Message Service (SMS) messages, and the like.
- other message types can be defined based on a variety of data attributes maintained in content data store 24 .
- other message types that exhibit a high density of name use in the message body can be defined, such as wall posts or emails on a user's wedding anniversary, the first N wall posts when a user initially registers with the website 20, or the first N wall posts after a user posts a significant life event (such as the birth of a child, a new job, and the like).
- Additional filters can also be employed, such as limiting consideration to messages where there have been at least a threshold number of communications between sending and receiving users, to ensure that only messages between users having stronger social connections are considered.
- the nickname generating process may also filter messages based on geographic region and/or language version (e.g., German, French, Spanish, etc.) to generate nickname dictionaries tailored to a given region and/or language.
- FIG. 2 illustrates an example process, according to one possible implementation of the invention, directed to creating a nickname dictionary.
- the nickname generating process initializes temporary and permanent data structures involved in the process, such as a nickname table (302).
- the nickname table has one or more entries, each entry identifying a name, a nickname and a confidence score.
- Table 1 illustrates a segment of a nickname dictionary table according to one implementation of the invention, including possible nicknames for the name “Jonathan.”
- the nickname generating process accesses content data store 24 to create a message table based on select messages stored therein (304).
- the nickname generating process can be configured to search for all wall posts within a given time period that are directed to a target user on that user's birthday.
- the resulting message table, in one implementation, is a data structure that includes the text of all messages that match the selection criteria, grouped by the first names and last names of the target users.
- the nickname generating process may normalize the first and last names contained in user profiles, such as by removing extraneous capitalization or punctuation.
- the nickname generating process may also filter out stop words from the message strings, such as articles (“the”, “a”, and “an”), profanity, and other words (such as birthday, happy, and the like) known not to be nicknames.
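The normalization, stop-word filtering, and grouping steps above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the tiny stop list, and the ASCII-only tokenizer are hypothetical choices made here, and a real stop list would be much larger and would include profanity, as the text notes.

```python
import re
from collections import defaultdict

# Illustrative stop list only (assumption); built from the examples in the text.
STOP_WORDS = {"the", "a", "an", "happy", "birthday"}


def normalize_name(name):
    """Lowercase a profile name and strip punctuation and surrounding spaces."""
    return re.sub(r"[^\w\s-]", "", name).strip().lower()


def tokenize(text):
    """Split a message into lowercase word tokens, dropping stop words.

    Only ASCII letters and apostrophes are kept; this is a simplification.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_message_table(messages):
    """Group filtered message tokens by the target user's normalized name.

    messages is an iterable of (target_name, text) pairs.
    Returns {name: [tokens_of_message_1, tokens_of_message_2, ...]}.
    """
    table = defaultdict(list)
    for target_name, text in messages:
        table[normalize_name(target_name)].append(tokenize(text))
    return dict(table)
```

Here messages to "Jonathan" and "JONATHAN." land in the same group, which is the point of normalizing before grouping.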
- the consideration of confidence factors, as well as empirical analysis after generating the nickname dictionary, can be used to filter out words that do not have a threshold confidence score relative to a given name.
- the nickname generating process then generates one or more statistical attributes for name-word pairs based on analysis of the message table.
- the nickname generating process may generate a counts table as described more fully below.
- the counts table may include: 1) PAIR—the number of occurrences of a given word in a message to users having a given first or last name (for example, PAIR would yield a number of occurrences of the word “dude” in messages to users named “Gideon”); 2) WORD—the global number of occurrences of a given word in the data set; 3) NAME—the global number of total words in messages directed to users of a given first name; and 4) “NAME_BY_WORD”—a count, for a given word under consideration, of the total number of words in communicated messages to users having first or last names, where the word (under consideration) appears in at least one message to at least one user having such a first or last name.
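A minimal sketch of how the four counts could be derived in memory is shown below. It assumes a message table shaped as `{name: [list of word tokens per message]}`; that shape and the function name `build_counts` are assumptions made for illustration, and the source describes doing this work with a query engine over large tables rather than in a single process.

```python
from collections import Counter


def build_counts(message_table):
    """Compute PAIR, WORD, NAME and NAME_BY_WORD from a message table.

    message_table maps a name to a list of token lists, one per message
    directed to users with that name (assumed shape).
    """
    pair = Counter()   # (name, word) -> occurrences of word in messages to name
    word = Counter()   # word -> occurrences across the whole data set
    name = Counter()   # name -> total words in messages directed to that name
    for n, token_lists in message_table.items():
        for tokens in token_lists:
            for w in tokens:
                pair[(n, w)] += 1
                word[w] += 1
                name[n] += 1

    # NAME_BY_WORD: for each word, sum the NAME totals of every name that
    # received that word at least once.
    name_by_word = Counter()
    for n, w in pair:
        name_by_word[w] += name[n]
    return pair, word, name, name_by_word
```

For example, if "hey" appears in messages to both "jonathan" (3 words in total) and "gideon" (2 words in total), its NAME_BY_WORD value is 5.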
- the nickname generating process then applies the counts table to a statistical algorithm to generate confidence scores for the name-word associations and maps the resulting data to the nickname table (308).
- each entry of the counts table, in concept, defines the entries for a 2×2 contingency table for a given name-word pair, including the entries defined by Table 2, below.
- PAIR indicates the number of occurrences of the name-word pair in the message table
- NAME−PAIR is the number of occurrences of all words other than the word under consideration (non-word) in all messages directed to a user having the name in the name-word pair
- WORD−PAIR indicates the global number of occurrences of the candidate word in connection with all other names (non-name)
- NAME_BY_WORD−(PAIR+WORD) is the number of occurrences of all other names (non-names) with all other words (non-words), limited to the NAME_BY_WORD space.
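As a sketch of how a 2×2 table like this can be tested, the block below computes a one-sided Fisher's exact test p-value from the hypergeometric distribution using only the standard library. The function names are hypothetical, the four cells follow the expressions given in the text, and treating a small p-value as evidence of association is the usual reading of the test; how the result is mapped to a stored confidence score is not specified here.

```python
from math import comb


def contingency_cells(pair, name, word, name_by_word):
    """Return the four cells in row-major order, using the text's expressions."""
    return (
        pair,                          # name, word
        name - pair,                   # name, non-word
        word - pair,                   # non-name, word
        name_by_word - (pair + word),  # non-name, non-word (as stated in the text)
    )


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns the probability of seeing a count of at least `a` in the top-left
    cell when the row and column totals are held fixed.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("all cells must be non-negative")
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(total, col1)
```

Usage: `fisher_exact_greater(*contingency_cells(pair, name, word, name_by_word))`. With large counts, a library routine such as SciPy's `fisher_exact` may be a more practical choice than summing exact binomial coefficients.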
- the following APACHE HIVE code segment provides an illustrative example of generating the counts table from other preliminary tables or data structures, including fbinv_pair_counts, a table grouped by name of the PAIR variable, and fbinv_name_counts, a table grouped by name of the NAME variable.
- the APACHE HIVE code segment also demonstrates application of a statistical algorithm to the counts table, such as Fisher's Exact Test.
- Another test involves threshold comparisons of different count values. For example, if at least X percent of occurrences of a given word are on the walls of people with a given name (PAIR/WORD), and this word accounts for more than 1 in 10,000 of all the words on the walls of people with that name (PAIR/NAME>1/10000), then the nickname generating process considers the name and word to be sufficiently associated such that the word can be considered to be a nickname for the name.
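The threshold comparison can be written as a short predicate. In this sketch the function and parameter names are hypothetical, and because the text leaves X unspecified, the word-share threshold is a required argument expressed as a fraction rather than a percentage.

```python
def passes_threshold_test(pair, word, name, min_word_share, min_name_share=1 / 10000):
    """Return True if a word is sufficiently associated with a name.

    pair: occurrences of the word in messages to users with the name (PAIR)
    word: global occurrences of the word (WORD)
    name: total words in messages to users with the name (NAME)
    min_word_share: the "X percent" threshold as a fraction (value not given in the text)
    min_name_share: the 1-in-10,000 threshold from the text
    """
    if word <= 0 or name <= 0:
        return False
    return pair / word >= min_word_share and pair / name > min_name_share
```

For instance, with `min_word_share=0.5`, a word seen 40 times for a name out of 50 global occurrences passes when the name has 100,000 total words (4 in 10,000) but fails when it has 1,000,000 (0.4 in 10,000).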
- the end result of the nickname generating process is, in one implementation, a nickname dictionary that includes a plurality of entries where each entry identifies a name, a nickname and a confidence score indicating a degree of confidence in the nickname.
- the process described above can be repeated over time to further refine the nickname dictionary as more data becomes available.
- the process can be repeated using a sliding analysis window (such as the last year or some other interval) to adjust for possible shifts in nickname usage or other developments over time.
- a variety of functions can leverage the resulting nickname dictionary, such as search engines for locating entities or user profiles (such as search query suggestions, search query expansion, result ranking), and registration processes (such as username suggestions and data field seeding).
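Search query expansion is one way such a dictionary might be consumed. The sketch below assumes the dictionary is available as `(name, nickname, confidence)` tuples, matching the entry fields described above; the function name, the 0-to-1 confidence scale, and the default cutoff are assumptions made for illustration.

```python
def expand_query_name(name, dictionary, min_confidence=0.5):
    """Return the queried name followed by its nicknames, most confident first.

    dictionary is an iterable of (name, nickname, confidence) tuples.
    Nicknames scoring below min_confidence are left out.
    """
    key = name.strip().lower()
    candidates = [
        (conf, nick)
        for entry_name, nick, conf in dictionary
        if entry_name == key and conf >= min_confidence
    ]
    # Sort by descending confidence, then alphabetically for stable output.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [key] + [nick for _, nick in candidates]
```

A search system could then match profiles against every returned term, optionally weighting results by the confidence of the term that matched.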
- Fisher's exact test or other statistical algorithm can be implemented as a PYTHON script, which as shown above can be called using a TRANSFORM clause.
- Other development platforms that can leverage APACHE HADOOP or other Map-Reduce execution engines can be used as well.
- FIG. 1 illustrates an example distributed computing system, consisting of one master server 22a and two slave servers 22b.
- the distributed computing system comprises a high-availability cluster of commodity servers in which the slave servers are typically called nodes. Though only two nodes are shown in FIG. 1 , the number of nodes might well exceed a hundred, or even a thousand, in some embodiments. Ordinarily, nodes in a high-availability cluster are redundant, so that if one node crashes while performing a particular application, the cluster software can restart the application on one or more other nodes.
- a master server such as 22a receives a job from a client and then assigns tasks resulting from that job to slave servers or nodes, such as servers 22b, which do the actual work of executing the assigned tasks upon instruction from the master and which move data between tasks.
- the client jobs will invoke HADOOP's MapReduce functionality, as discussed above.
- a master server such as server 22a
- the master server 22a governs a distributed file system that supports parallel processing of large databases.
- the master server 22a manages the file system's namespace and block mapping to nodes, as well as client access to files, which are actually stored on slave servers or nodes, such as servers 22b.
- the slave servers do the actual work of executing read and write requests from clients and perform block creation, deletion, and replication upon instruction from the master server.
- FIG. 3 illustrates an example computing system architecture, which may be used to implement a server 22a, 22b.
- hardware system 200 comprises a processor 202, a cache memory 204, and one or more executable modules and drivers, stored on a computer readable medium, directed to the functions described herein.
- hardware system 200 includes a high performance input/output (I/O) bus 206 and a standard I/O bus 208.
- a host bridge 210 couples processor 202 to high performance I/O bus 206
- I/O bus bridge 212 couples the two buses 206 and 208 to each other.
- a system memory 214 and one or more network/communication interfaces 216 couple to bus 206.
- network interface 216 provides communication between hardware system 200 and any of a wide range of networks, such as an Ethernet (e.g., IEEE 802.3) network, a backplane, etc.
- Mass storage 218 provides permanent storage for the data and programming instructions to perform the above-described functions implemented in the servers 22a, 22b
- system memory 214 e.g., DRAM
- I/O ports 220 are one or more serial and/or parallel communication ports that provide communication between additional peripheral devices, which may be coupled to hardware system 200.
- the operations of the nickname generating process described herein are implemented as a series of executable modules run by hardware system 200, individually or collectively in a distributed computing environment.
- a set of software modules and/or drivers implements a network communications protocol stack, parallel computing functions, nickname generating processes, and the like.
- the foregoing functional modules may be realized by hardware, executable modules stored on a computer readable medium, or a combination of both.
- the functional modules may comprise a plurality or series of instructions to be executed by a processor in a hardware system, such as processor 202. Initially, the series of instructions may be stored on a storage device, such as mass storage 218.
- the series of instructions can be stored on any suitable storage medium, such as a diskette, CD-ROM, ROM, EEPROM, etc.
- the series of instructions need not be stored locally, and could be received from a remote storage device, such as a server on a network, via network/communications interface 216.
- the instructions are copied from the storage device, such as mass storage 218, into memory 214 and then accessed and executed by processor 202.
- An operating system manages and controls the operation of hardware system 200, including the input and output of data to and from software applications (not shown).
- the operating system provides an interface between the software applications being executed on the system and the hardware components of the system.
- Any suitable operating system may be used, such as the LINUX Operating System, the APPLE MACINTOSH Operating System, available from Apple Inc. of Cupertino, Calif., UNIX operating systems, MICROSOFT® WINDOWS® operating systems, BSD operating systems, and the like.
- the nickname generating functions described herein may be implemented in firmware or on an application specific integrated circuit.
- the above-described elements and operations can be comprised of instructions that are stored on storage media.
- the instructions can be retrieved and executed by a processing system.
- Some examples of instructions are software, program code, and firmware.
- Some examples of storage media are memory devices, tape, disks, integrated circuits, and servers.
- the instructions are operational when executed by the processing system to direct the processing system to operate in accord with the invention.
- processing system refers to a single processing device or a group of inter-operational processing devices. Some examples of processing devices are integrated circuits and logic circuitry. Those skilled in the art are familiar with instructions, computers, and storage media.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- General Health & Medical Sciences (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Methods, apparatuses and systems for generating a name-word dictionary that includes associations between names of users and candidate words (e.g., nicknames) based on statistical analysis of user communications observed at a network communications facility, such as a social network system, an email provider and the like.
Description
- This application is a continuation under 35 U.S.C. §120 of U.S. patent application Ser. No. 12/623,311, filed 20 Nov. 2009, which is incorporated herein by reference.
- The present disclosure generally relates to data mining and, more particularly, to mining of information in user communications to develop a nickname dictionary.
- Social networks, or social utilities that enable connections between users (including people, businesses, and other entities) have become prevalent in recent years. In particular, social network systems allow users to communicate information very efficiently. For example, a user may post contact information, background information, job information, hobbies, and/or other user-specific data to a location associated with the user on a social network system. Other users can then review the posted data by browsing user profiles or searching for profiles including specific data. Users may also post messages directly on user profiles and send messages to a private inbox. The social network systems also allow users to associate themselves with other users, thus creating a web of connections among the users of the social network system.
- Searching for users of a social network system, or more generally for information regarding an individual or other entity, typically involves composing a search query including forenames and surnames and submitting it to a search system. Often, however, users adopt nicknames as user names, which can be problematic for user searches when a searching user only has forename and surname information. Alternatively, some searchers encounter the opposite problem, failing to find a user by a common nickname because the account is under the full forename. In addition, nicknames that a given individual chooses to adopt may change over time, presenting similar problems even when a searching user remembers a nickname.
- The present invention is directed to methods, apparatuses and systems for generating a nickname dictionary that includes associations between names of users and nicknames based on statistical analysis of user communications observed at a network communications facility, such as a social network system, an email provider and the like. In one implementation, a nickname generating process analyzes user communications to develop a nickname dictionary that includes a plurality of entries, each entry identifying a name, a nickname and a confidence score indicating a degree of confidence in the nickname. In this manner, implementations of the invention leverage communications between users to learn nicknames. A variety of functions can leverage the resulting nickname dictionary, such as search engines for locating users or user profiles (such as search query suggestions, search query expansion, result ranking), and registration processes (such as username suggestions and data field seeding).
-
FIG. 1 is a schematic diagram of a computer network environment, in which particular embodiments of the present invention may operate. -
FIG. 2 is a flow chart setting forth an example process according to one implementation of the invention. -
FIG. 3 is a functional block diagram illustrating an example network device hardware system architecture. - A social network system offers its users the ability to communicate and interact with other users of the website. In some implementations, users join the social network system and then add connections to a number of other users to whom they desire to be connected. As used herein, the term “friend” refers to any other user to whom a user has formed a connection, association, or relationship via the website. Connections may be added explicitly by a user—if for example, the user selects another user as a friend—or automatically created by the social network system based on common characteristics of the users (e.g., users who are alumni of the same educational institution). Connections in social network systems are usually in both directions, but need not be, so the terms “user” and “friend” depend on the frame of reference. For example, if Bob and Joe are both users and connected to each other in the website, Bob and Joe, both users, are also each other's friends. The connection between users may be a direct connection; however, some embodiments of a social network system allow the connection to be indirect via one or more levels of connections. Also, the term friend need not require that users actually be friends in real life (which would generally be the case when one of the users is a business or other entity); it simply implies a connection in the social network system. In particular implementations, a user of the social network system may be any suitable entity, such as an individual, a corporation, a partnership, a joint venture, and combinations of the foregoing.
- The
social network system 20 maintains one or more network communications facilities that provide users with the ability to communicate with other users. Some types of actions include “friend requesting,” “wall posting,” and “sending a message.” Upon acceptance of a friend request, the requestor and requestee become friends. Friends may access more information about each other's profile than other non-friend users. A wall post allows users to post a message to a target user's wall. The wall is a forum for comments or insights about another user or a given topic and typically appears on a user's profile page. Typically, a first user can create a wall post on a target user's wall, to which other users, including the target user, may add messages to form a message thread. The social network system may also maintain some form of private message communications facility, such as intra- and inter-domain electronic mail, that users access by navigating to a private inbox. - Particular implementations of the invention operate in a wide area network environment, such as the Internet, including multiple network addressable systems.
FIG. 1 illustrates an example network environment, in which embodiments of the invention may operate. Network cloud 60 generally represents one or more interconnected networks, over which the systems and hosts described herein can communicate. Network cloud 60 may include packet-based wide area networks (such as the Internet), private networks, wireless networks, satellite networks, cellular networks, paging networks, and the like. As FIG. 1 illustrates, a particular implementation of the invention can operate in a network environment comprising social network system 20 and one or more client devices 30. Client devices 30 are operably connected to the network environment via a network service provider, a wireless carrier, or any other suitable means. - The
social network system 20 comprises computing systems that allow users to communicate or otherwise interact with each other and access content, such as user profiles, as described herein. Social network system 20 is a network addressable system that, in one implementation, comprises one or more physical servers 22 and data store 24. The one or more physical servers 22 are operably connected to computer network 60 via a router 26. In one implementation, the functionality hosted by the one or more physical servers 22 may include web or HTTP servers, FTP servers, and the like. - Physical servers 22 host functionality directed to the operations of a social network. For example,
social network system 20 may host a website that allows one or more users, at one or more client devices 30, to communicate with one another via the website. Content data store 24 stores content and data relating to, and enabling, operation of the social network as digital data objects. A data object, in particular implementations, is an item of digital information typically stored or embodied in a data file, database or record. Content objects may take many forms, including: text (e.g., ASCII, SGML, HTML), images (e.g., jpeg, tif and gif), graphics (vector-based or bitmap), audio, video (e.g., mpeg), or other multimedia, and combinations thereof. Content object data may also include executable code objects (e.g., games executable within a browser window or frame), podcasts, etc. Logically, content data store 24 corresponds to a variety of separate and integrated databases, such as relational databases and object-oriented databases, that maintain information as an integrated collection of logically related records or files stored on one or more physical systems. Structurally, content data store 24 connotes a large class of data storage and management systems. In particular implementations, content data store 24 may be implemented by any suitable physical system including components, such as database servers, mass storage media, media library systems, storage area networks, data storage clouds, and the like. -
Content data store 24 includes data associated with different social network system 20 users. The social network system 20 maintains a user profile for each user of the website 20. User profiles include data that describe the users of a social network, including proper names (first, middle and last of a person, a tradename or company name of a business entity, etc.), biographic, demographic, and other types of descriptive information, such as work experience, educational history, hobbies or preferences, location, and additional descriptive data. For example, user profiles may include a user's birthday, relationship status, city of residence, and the like. The website 20 further stores data describing one or more relationships between different users. The relationship information may indicate users who have similar or common work experience, group memberships, hobbies, or educational history. A user profile may also include affinity information for another user based on relationships with other users and a user's implicit and explicit interaction with content on the site (reading story headlines, frequency of accessing content, feedback from other users, profiles, etc.). A user profile may also include privacy settings indicating how accessible to other users any of the information in the user profile is, including user contact information or user-defined relationships with other users, such as the user's friends, networks, groups, or the like. -
Client device 30 is a computer or computing device including functionality for communicating over a computer network. A client node can be a desktop computer, a laptop computer, or a mobile device (including cellular telephones, personal digital assistants, and mobile gaming devices). A client device 30 may execute one or more client applications, such as a web browser, to access and view content over a computer network. In particular implementations, the client applications allow users to enter addresses of specific network resources, such as resources hosted by social network system 20, to be retrieved. These addresses can be Uniform Resource Locators, or URLs. In addition, once a page or other resource has been retrieved, the client applications may provide access to other pages or records when the user “clicks” on hyperlinks to other resources. In some implementations, such hyperlinks are located within the web pages and provide an automated way for the user to enter the URL of another page and to retrieve that page. The pages or resources can be data records whose content includes plain textual information or more complex digitally encoded multimedia content, such as software programs or other code objects, graphics, images, audio signals, videos, and so forth. - The
social network system 20 maintains in content data store 24 a number of objects for the different kinds of items with which a user may interact on the website 20. In one example embodiment, these objects include user profiles, application objects, and message objects (such as for wall posts, invitations, notifications, news feeds, emails and other messages). In one embodiment, an object is stored by the website 20 for each instance of its associated item. These objects and the actions discussed herein are provided for illustration purposes only, and it can be appreciated that an unlimited number of variations and features can be provided on a social network system 20. - When a user creates a message directed to another user,
social network system 20 generates a message object that includes a plurality of attributes. Common to most message channel types (such as wall posts, invitations, notifications, news feeds and electronic mail) are attributes such as identity of the sending user, identity of the target user, a text string embodying the message, the date and time the message was sent, and the like. As discussed below, a nickname generating process, executing periodically, can access content data store 24 to analyze the messages and learn associations between user names (first and/or last names) and nicknames possibly contained in the messages. - This nickname generating process can search for messages between users based on one or more criteria defining a message type. Beyond message channels (e.g., wall posts, notifications, invitations, news feed items, electronic mail, short message service, etc.), messages can be classified into a variety of different message types, each defined by a set of attributes or matching rules. For example, a message type may correspond to “birthday wall posts” defined as wall posts that are directed to a user of
social network system 20 on that user's birthday. To locate messages of this type, various database queries can be composed that identify all messages of the wall post channel type that were sent to a target user on the birthday included in the user's profile. In some embodiments, trigger words in the message string itself, such as “happy birthday,” “happy B-day,” etc. can be used in addition to or in lieu of matching to a target user's birthday. In other implementations, the nickname generating process may search for email messages directed to users that occur on the recipient user's birthday and/or include certain trigger words in addition to or in lieu of wall posts. Other embodiments might search logs of instant messaging communications between users, Short Message Service (SMS) messages, and the like. As one skilled in the art will recognize, other message types can be defined based on a variety of data attributes maintained in content data store 24. For example, other message types that exhibit a high density of name use in the message body can be defined, such as wall posts or emails on a user's wedding anniversary, the first N wall posts when a user initially registers with the website 20, or the first N wall posts after a user posts a significant life event (such as the birth of a child, a new job, and the like). Additional filters can also be employed, such as limiting consideration to messages where there have been a threshold number of communications between sending and receiving users to ensure that only messages between users having stronger social connections are considered. Still further, the nickname generating process may also filter messages based on geographic region and/or language version (e.g., German, French, Spanish, etc.) to generate nickname dictionaries tailored to a given region and/or language. -
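The "birthday wall posts" message type described above can be sketched in Python as a simple predicate. This is only an illustrative sketch: the Message record, its field names, the TRIGGER_PHRASES tuple and the use_triggers flag are hypothetical and do not come from the disclosure, and treating trigger phrases as an optional alternative to the date match is just one of the combinations the text permits.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Message:
    # Hypothetical message object; field names are assumptions for illustration.
    sender_id: int
    target_id: int
    channel: str        # e.g. "wall_post" or "email"
    text: str
    sent_at: datetime


# Assumed trigger phrases, taken from the examples in the text.
TRIGGER_PHRASES = ("happy birthday", "happy b-day")


def is_birthday_wall_post(msg, birthdays, use_triggers=False):
    """Return True if msg is a wall post sent on the target user's birthday.

    birthdays maps a user id to that user's date of birth. When use_triggers
    is True, a wall post containing a trigger phrase also qualifies.
    """
    if msg.channel != "wall_post":
        return False
    born = birthdays.get(msg.target_id)
    on_birthday = (
        born is not None
        and (born.month, born.day) == (msg.sent_at.month, msg.sent_at.day)
    )
    has_trigger = any(phrase in msg.text.lower() for phrase in TRIGGER_PHRASES)
    return on_birthday or (use_triggers and has_trigger)
```

In a production setting this selection would more likely be expressed as a database query against content data store 24; the predicate form is used here only to make the matching rule explicit.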
FIG. 2 illustrates an example process, according to one possible implementation of the invention, directed to creating a nickname dictionary. As a preliminary step, the nickname generating process initializes temporary and permanent data structures involved in the process, such as a nickname table (302). In one implementation, the nickname table has one or more entries, each entry identifying a name, a nickname and a confidence score. Table 1 illustrates a segment of a nickname dictionary table according to one implementation of the invention, including possible nicknames for the name “Jonathan.” -
TABLE 1
Name      Nickname  Confidence Score
Jonathan  Jon       0.7
Jonathan  Jack      0.64
Jonathan  Jonny     0.8
Jonathan  Jonnie    0.5
Jonathan  Nathan    0.49
- As
FIG. 2 illustrates, the nickname generating process accesses content data store 24 to create a message table based on select messages stored therein (304). For example, the nickname generating process can be configured to search for all wall posts within a given time period that are directed to a target user on that user's birthday. The resulting message table, in one implementation, is a data structure that includes the text of all messages that match the selection criteria grouped by the first names and last names of the target users. As optional pre-processing steps, the nickname generating process may normalize the first and last names contained in user profiles, such as by removing extraneous capitalization or punctuation. The nickname generating process may also filter out stop words from the message strings, such as articles (“the”, “a”, and “an”), profanity, and other words (such as birthday, happy, and the like) known not to be nicknames. However, in some implementations, the consideration of confidence factors, as well as empirical analysis after generating the nickname dictionary, can be used to filter out words that do not have a threshold confidence score relative to a given name. - The nickname generating process then generates one or more statistical attributes for name-word pairs based on analysis of the message table. For example, the nickname generating process, in one implementation, may generate a counts table as described more fully below. 
For example, for a given name-word pair, the counts table may include: 1) PAIR—the number of occurrences of a given word in a message to users having a given first or last name (for example, PAIR would yield a number of occurrences of the word “dude” in messages to users named “Gideon”); 2) WORD—the global number of occurrences of a given word in the data set; 3) NAME—the global number of total words in messages directed to users of a given first name; and 4) “NAME_BY_WORD”—a count, for a given word under consideration, of the total number of words in communicated messages to users having first or last names, where the word (under consideration) appears in at least one message to at least one user having such a first or last name. NAME_BY_WORD, in one implementation, defines the word space, narrowing it to the name situations where the name-word association could occur. To compute NAME_BY_WORD, the nickname generating process may, for a given word, find every name where there is at least one occurrence of that word in a message, and then obtain a count of the total number of words in messages associated with those names. Conceptually, it is the space of words related/connected by names to the word in the name-word pair.
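The four counts defined above can be computed in a single pass over the message table, as the following Python sketch suggests. It assumes, purely for illustration, that the message table is available as an iterable of (name, text) pairs and that whitespace splitting is an acceptable tokenizer; the function name build_counts is hypothetical.

```python
from collections import Counter, defaultdict


def build_counts(messages):
    """Compute PAIR, WORD, NAME and NAME_BY_WORD from (name, text) pairs.

    name is the (normalized) name of the target user and text is the
    message string. Tokenization by whitespace is a simplifying assumption.
    """
    pair = Counter()                    # PAIR[(name, word)]
    word = Counter()                    # WORD[word]
    name = Counter()                    # NAME[name]: total words sent to that name
    names_for_word = defaultdict(set)   # names whose messages contain the word

    for target_name, text in messages:
        for token in text.lower().split():
            pair[(target_name, token)] += 1
            word[token] += 1
            name[target_name] += 1
            names_for_word[token].add(target_name)

    # NAME_BY_WORD[word]: total words in messages to every name where the
    # word occurs at least once. Computed after the pass so NAME is complete.
    name_by_word = {
        w: sum(name[n] for n in names) for w, names in names_for_word.items()
    }
    return pair, word, name, name_by_word
```

For example, with the messages ("jonathan", "hey jonny"), ("jonathan", "jonny rocks") and ("gideon", "hey dude"), the word "hey" occurs for both names, so its NAME_BY_WORD is the sum of all words sent to both names, while the NAME_BY_WORD of "jonny" covers only the words sent to users named "jonathan".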
- The nickname generating process then applies the counts table to a statistical algorithm to generate confidence scores for the name-word associations and maps the resulting data to the nickname table (308). In a particular implementation, each entry of the counts table, in concept, defines the entries for a 2×2 contingency table for a given name-word pair, including the entries defined by Table 2, below.
-
TABLE 2 Contingency Table, Word-Name Pair
           Name          Non-Name
Word       PAIR          WORD − PAIR
Non-Word   NAME − PAIR   NAME_BY_WORD − (PAIR + WORD)
- As Table 2 illustrates and as discussed above, PAIR indicates the number of occurrences of the name-word pair in the message table, while NAME − PAIR is the number of occurrences of all words other than the word under consideration (non-word) in all messages directed to a user having the name in the name-word pair. WORD − PAIR indicates the global number of occurrences of the candidate word in connection with all other names (non-name), while NAME_BY_WORD − (PAIR + WORD) is the number of occurrences of all other names (non-names) with all other words (non-words) limited to the NAME_BY_WORD space.
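As a minimal sketch, the cells of Table 2 can be derived from one row of the counts table as follows. The function name contingency_table is hypothetical, and the clamping of the fourth cell to zero is an added assumption (not stated in the disclosure) to keep the table non-negative for degenerate inputs.

```python
def contingency_table(pair, word, name, name_by_word):
    """Build the 2x2 table of Table 2 for one name-word pair.

    Rows are (word, non-word); columns are (name, non-name).
    The cell formulas follow Table 2 as written.
    """
    word_name = pair
    word_other_names = word - pair
    other_words_name = name - pair
    # Assumption: clamp at zero, since Table 2's formula can go negative
    # when the word accounts for nearly all words in its NAME_BY_WORD space.
    other_words_other_names = max(0, name_by_word - (pair + word))
    return [
        [word_name, word_other_names],
        [other_words_name, other_words_other_names],
    ]
```

For instance, with PAIR = 30, WORD = 40, NAME = 1000 and NAME_BY_WORD = 5000, the four cells are 30, 10, 970 and 4930.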
- The following APACHE HIVE code segment provides an illustrative example of generating the counts table from other preliminary tables or data structures, including fbinv_pair_counts, a table grouped by name of the PAIR variable, and fbinv_name_counts, a table grouped by name of the NAME variable. The APACHE HIVE code segment also demonstrates application of a statistical algorithm to the counts table, such as Fisher's Exact Test.
-
CREATE TABLE IF NOT EXISTS fbinv_nicknames(firstname STRING, word STRING, certainty DOUBLE);
ALTER TABLE fbinv_nicknames SET TBLPROPERTIES('RETENTION'='30');

-- Join the preliminary count tables, then score each name-word pair
-- with the mapper and reducer scripts.
INSERT OVERWRITE TABLE fbinv_nicknames
SELECT TRANSFORM(map_output.firstname, map_output.word, map_output.certainty)
  USING 'python /home/fbinv/fisher_reducer.py'
  AS firstname, word, certainty
FROM (
  SELECT TRANSFORM(step_two.firstname, step_two.word, step_two.pair_freq,
                   step_two.name_freq, step_two.freq, d.freq)
    USING 'python /home/fbinv/fisher_mapper.py'
    AS firstname, word, certainty
  FROM (
    SELECT step_one.firstname, step_one.word, step_one.pair_freq,
           step_one.name_freq, c.freq
    FROM (
      -- PAIR counts joined with NAME counts on the first name
      SELECT a.firstname AS firstname, a.word AS word,
             a.freq AS pair_freq, b.freq AS name_freq
      FROM fbinv_pair_counts a
      JOIN fbinv_name_counts b ON a.firstname = b.firstname
    ) step_one
    JOIN fbinv_name_word_counts c ON step_one.word = c.word
  ) step_two
  JOIN fbinv_word_counts d ON step_two.word = d.word
  CLUSTER BY firstname
) map_output;

-- Export the scored pairs to a local directory.
INSERT OVERWRITE LOCAL DIRECTORY '/home/fbinv/fisher_nicks'
SELECT fbinv_nicknames.firstname, fbinv_nicknames.word, fbinv_nicknames.certainty
FROM fbinv_nicknames;

- As discussed above, a particular implementation of the invention applies Fisher's Exact Test (or Fisher-Irwin test), which is a statistical significance test that can be used in the analysis of contingency tables. In some implementations, Fisher's exact test can be used to determine the significance of the association between a given name and a given word in the messages table. The resulting P-value from Fisher's Exact Test can be used as a confidence score. In general, a P-value less than or equal to 0.05 is considered sufficient to establish a statistically significant association between a name and a word. In other implementations, alternative tests can be used to test for association confidence, such as Pearson's chi-square test, a G-test or Barnard's test.
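The contents of the mapper and reducer scripts are not reproduced in this disclosure. As a hedged illustration only, a one-sided Fisher's exact test on a 2×2 table can be written in plain Python using the hypergeometric distribution. The function name fisher_exact_greater and the choice of a one-sided ("greater") alternative are assumptions; the text does not say which alternative is used.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns the probability, with row and column totals held fixed, of
    observing a top-left cell at least as large as a. All cells must be
    non-negative integers.
    """
    row1 = a + b            # total occurrences of the word
    col1 = a + c            # total words directed to the name
    n = a + b + c + d       # size of the whole word space
    total = comb(n, col1)   # number of ways to choose the name column
    tail = 0
    for k in range(a, min(row1, col1) + 1):
        # comb returns 0 when col1 - k exceeds n - row1, so impossible
        # tables contribute nothing to the sum.
        tail += comb(row1, k) * comb(n - row1, col1 - k)
    return tail / total
```

A small P-value indicates that the word appears in messages to the name more often than chance would suggest. For very large counts, a library routine that works with log-probabilities may be preferable to exact integer arithmetic.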
- In other embodiments, alternative tests can be used in addition to, or in lieu of, statistical tests. In one implementation, another test involves threshold comparisons of different count values. For example, if at least X percent of occurrences of a given word are on the walls of people with a given name (PAIR/WORD), and this word accounts for at least 1 in 10,000 of all the words on the walls of people with that name (PAIR/NAME>1/10000), then the nickname generating process considers the name and word to be sufficiently associated such that the word can be considered to be a nickname for the name.
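The threshold comparison above reduces to two ratio checks, sketched below in Python. The function and parameter names are hypothetical; because the text leaves X unspecified, the share of word occurrences is a required argument (expressed as a fraction rather than a percentage) and no default is assumed for it.

```python
def passes_threshold(pair, word, name, min_word_share, min_name_share=1 / 10000):
    """Return True if a word qualifies as a nickname for a name.

    min_word_share is the X of the text as a fraction (0.5 for 50 percent):
    the minimum share of the word's occurrences that fall in messages to
    the name (PAIR/WORD). min_name_share is the minimum share of the name's
    words that the word accounts for (PAIR/NAME), 1 in 10,000 by default.
    """
    if word == 0 or name == 0:
        return False  # no evidence for this pair
    return pair / word >= min_word_share and pair / name > min_name_share
```

With PAIR = 30, WORD = 40 and NAME = 1000, and X set to 50 percent, both conditions hold (0.75 and 0.03 respectively), so the word would be treated as a nickname.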
- As discussed above, the end result of the nickname generating process is, in one implementation, a nickname dictionary that includes a plurality of entries where each entry identifies a name, a nickname and a confidence score indicating a degree of confidence in the nickname. The process described above can be repeated over time to further refine the nickname dictionary as more data becomes available. In some implementations, the process can be repeated using a sliding analysis window (such as the last year or some other interval) to adjust for possible shifts in nickname usage or other developments over time. A variety of functions can leverage the resulting nickname dictionary, such as search engines for locating entities or user profiles (such as search query suggestions, search query expansion, result ranking), and registration processes (such as username suggestions and data field seeding).
- As described herein, the nickname-generating process can be implemented as a series of computer-readable instructions, embodied on a data storage medium, that when executed are operable to cause one or more processors to implement the operations described above. For smaller datasets, the operations described above can be executed on a single computing platform or node. For larger systems and resulting data sets, parallel computing platforms can be used. For example, the operations discussed above can be implemented using APACHE HIVE to accomplish ad hoc querying, summarization and data analysis, as well as incorporating statistical modules by embedding mapper and reducer scripts, such as PYTHON or PERL scripts that implement a statistical algorithm. For example, Fisher's exact test or other statistical algorithm can be implemented as a PYTHON script, which as shown above can be called using a TRANSFORM clause. Other development platforms that can leverage APACHE HADOOP or other Map-Reduce execution engines can be used as well.
- The Apache Software Foundation has developed a collection of programs called HADOOP (named after a toddler's stuffed elephant), which includes: (a) a distributed file system; and (b) an application programming interface (API) and corresponding implementation of MapReduce.
FIG. 1 illustrates an example distributed computing system, consisting of one master server 22 a and two slave servers 22 b. In some embodiments of the present invention, the distributed computing system comprises a high-availability cluster of commodity servers in which the slave servers are typically called nodes. Though only two nodes are shown in FIG. 1, the number of nodes might well exceed a hundred, or even a thousand, in some embodiments. Ordinarily, nodes in a high-availability cluster are redundant, so that if one node crashes while performing a particular application, the cluster software can restart the application on one or more other nodes. - Multiple nodes also facilitate the parallel processing of large databases. In some embodiments of the present invention, a master server, such as 22 a, receives a job from a client and then assigns tasks resulting from that job to slave servers or nodes, such as
servers 22 b, which do the actual work of executing the assigned tasks upon instruction from the master and which move data between tasks. In some embodiments, the client jobs will invoke HADOOP's MapReduce functionality, as discussed above. - Likewise, in some embodiments of the present invention, a master server, such as
server 22 a, governs a distributed file system that supports parallel processing of large databases. In particular, the master server 22 a manages the file system's namespace and block mapping to nodes, as well as client access to files, which are actually stored on slave servers or nodes, such as servers 22 b. In turn, in some embodiments, the slave servers do the actual work of executing read and write requests from clients and perform block creation, deletion, and replication upon instruction from the master server. - While the foregoing processes and mechanisms can be implemented by a wide variety of physical systems and in a wide variety of network and computing environments, the server or computing systems described below provide example computing system architectures for didactic, rather than limiting, purposes.
-
FIG. 3 illustrates an example computing system architecture, which may be used to implement a server 22 a, 22 b. In one embodiment, hardware system 200 comprises a processor 202, a cache memory 204, and one or more executable modules and drivers, stored on a computer readable medium, directed to the functions described herein. Additionally, hardware system 200 includes a high performance input/output (I/O) bus 206 and a standard I/O bus 208. A host bridge 210 couples processor 202 to high performance I/O bus 206, whereas I/O bus bridge 212 couples the two buses 206 and 208 to each other. A system memory 214 and one or more network/communication interfaces 216 couple to bus 206. Hardware system 200 may further include video memory (not shown) and a display device coupled to the video memory. Mass storage 218 and I/O ports 220 couple to bus 208. Hardware system 200 may optionally include a keyboard and pointing device, and a display device (not shown) coupled to bus 208. Collectively, these elements are intended to represent a broad category of computer hardware systems, including but not limited to general purpose computer systems based on the x86-compatible processors manufactured by Intel Corporation of Santa Clara, Calif., and the x86-compatible processors manufactured by Advanced Micro Devices (AMD), Inc., of Sunnyvale, Calif., as well as any other suitable processor. - The elements of
hardware system 200 are described in greater detail below. In particular, network interface 216 provides communication between hardware system 200 and any of a wide range of networks, such as an Ethernet (e.g., IEEE 802.3) network, a backplane, etc. Mass storage 218 provides permanent storage for the data and programming instructions to perform the above-described functions implemented in the servers 22 a, 22 b, whereas system memory 214 (e.g., DRAM) provides temporary storage for the data and programming instructions when executed by processor 202. I/O ports 220 are one or more serial and/or parallel communication ports that provide communication between additional peripheral devices, which may be coupled to hardware system 200. -
Hardware system 200 may include a variety of system architectures; and various components of hardware system 200 may be rearranged. For example, cache 204 may be on-chip with processor 202. Alternatively, cache 204 and processor 202 may be packed together as a “processor module,” with processor 202 being referred to as the “processor core.” Furthermore, certain embodiments of the present invention may not require nor include all of the above components. For example, the peripheral devices shown coupled to standard I/O bus 208 may couple to high performance I/O bus 206. In addition, in some embodiments, only a single bus may exist, with the components of hardware system 200 being coupled to the single bus. Furthermore, hardware system 200 may include additional components, such as additional processors, storage devices, or memories. - In one implementation, the operations of the nickname generating process described herein are implemented as a series of executable modules run by
hardware system 200, individually or collectively in a distributed computing environment. In a particular embodiment, a set of software modules and/or drivers implements a network communications protocol stack, parallel computing functions, nickname generating processes, and the like. The foregoing functional modules may be realized by hardware, executable modules stored on a computer readable medium, or a combination of both. For example, the functional modules may comprise a plurality or series of instructions to be executed by a processor in a hardware system, such as processor 202. Initially, the series of instructions may be stored on a storage device, such as mass storage 218. However, the series of instructions can be stored on any suitable storage medium, such as a diskette, CD-ROM, ROM, EEPROM, etc. Furthermore, the series of instructions need not be stored locally, and could be received from a remote storage device, such as a server on a network, via network/communications interface 216. The instructions are copied from the storage device, such as mass storage 218, into memory 214 and then accessed and executed by processor 202. - An operating system manages and controls the operation of
hardware system 200, including the input and output of data to and from software applications (not shown). The operating system provides an interface between the software applications being executed on the system and the hardware components of the system. Any suitable operating system may be used, such as the LINUX Operating System, the APPLE MACINTOSH Operating System, available from Apple Inc. of Cupertino, Calif., UNIX operating systems, MICROSOFT® WINDOWS® operating systems, BSD operating systems, and the like. Of course, other implementations are possible. For example, the nickname generating functions described herein may be implemented in firmware or on an application specific integrated circuit. - Furthermore, the above-described elements and operations can be comprised of instructions that are stored on storage media. The instructions can be retrieved and executed by a processing system. Some examples of instructions are software, program code, and firmware. Some examples of storage media are memory devices, tape, disks, integrated circuits, and servers. The instructions are operational when executed by the processing system to direct the processing system to operate in accord with the invention. The term “processing system” refers to a single processing device or a group of inter-operational processing devices. Some examples of processing devices are integrated circuits and logic circuitry. Those skilled in the art are familiar with instructions, computers, and storage media.
- The present invention has been explained with reference to specific embodiments. For example, while embodiments of the present invention have been described as operating in connection with a social network system, the present invention can be used in connection with any communications facility that allows for communication of messages between users, such as an email hosting site. In addition, while some embodiments have been described as analyzing wall posts, other message channel types, such as email, can also be considered in addition to, or in lieu of, wall posts. Still further, the nickname generating process described above can be made accessible to external systems via a set of application programming interfaces. Other embodiments will be evident to those of ordinary skill in the art. It is therefore not intended that the present invention be limited, except as indicated by the appended claims.
Claims (20)
1. A method comprising:
accessing, by one or more computing devices, one or more messages, each message being directed to a target user and comprising one or more words, each target user being associated with one or more names;
identifying, by the one or more computing devices, one or more name-word associations between at least one name associated with at least one of the target users and one or more words identified in the messages, the identifying being based on statistics comprising a number of occurrences of a candidate word in messages directed to target users sharing a same name; and
determining, by the one or more computing devices, a confidence score for each of the one or more name-word associations.
2. The method of claim 1 , the identifying being based on analysis of statistical attribute data relevant to associations between the names and the words in the messages.
3. The method of claim 2 , wherein the analysis comprises application of one or more thresholds to the statistical attribute data.
4. The method of claim 2 , wherein the analysis comprises application of the statistical attribute data to a statistical analysis module to identify, for each name-word association, a confidence of the association between the name and the words identified in the messages.
5. The method of claim 4 , wherein the statistical analysis module implements Fisher's exact test.
6. The method of claim 1 , the statistics further comprising a number of occurrences of words other than the candidate word in the messages directed to the target users sharing the same name.
7. The method of claim 1 , wherein the messages are of one message type.
8. The method of claim 7 , wherein the message type corresponds to birthdays of the target users of the respective messages, wall posts on profile pages of a social network system, or electronic mail messages directed to target users associated with a social network system.
9. The method of claim 7 , wherein the message type corresponds to a high density of name use in the messages.
10. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
access, by one or more computing devices, one or more messages, each message being directed to a target user and comprising one or more words, each target user being associated with one or more names;
identify, by the one or more computing devices, one or more name-word associations between at least one name associated with at least one of the target users and one or more words identified in the messages, the identifying being based on statistics comprising a number of occurrences of a candidate word in messages directed to target users sharing a same name; and
determine, by the one or more computing devices, a confidence score for each of the one or more name-word associations.
11. The media of claim 10, the identifying being based on analysis of statistical attribute data relevant to associations between the names and the words in the messages.
12. The media of claim 11, wherein the analysis comprises application of one or more thresholds to the statistical attribute data.
13. The media of claim 11, wherein the analysis comprises application of the statistical attribute data to a statistical analysis module to identify, for each name-word association, a confidence of the association between the name and the words identified in the messages.
14. The media of claim 13, wherein the statistical analysis module implements Fisher's exact test.
15. A system comprising:
one or more processors; and
a memory coupled to the processors comprising instructions executable by the processors, the processors being operable when executing the instructions to:
access one or more messages, each message being directed to a target user and comprising one or more words, each target user being associated with one or more names;
identify one or more name-word associations between at least one name associated with at least one of the target users and one or more words identified in the messages, the identifying being based on statistics comprising a number of occurrences of a candidate word in messages directed to target users sharing a same name; and
determine a confidence score for each of the one or more name-word associations.
16. The system of claim 15, the statistics further comprising a number of occurrences of words other than the candidate word in the messages directed to the target users sharing the same name.
17. The system of claim 15, wherein the messages are of one message type.
18. The system of claim 17, wherein the message type corresponds to birthdays of the target users of the respective messages, wall posts on profile pages of a social network system, or electronic mail messages directed to target users associated with a social network system.
19. The system of claim 17, wherein the message type corresponds to a high density of name use in the messages.
20. The system of claim 15, the identification being based on analysis of statistical attribute data relevant to associations between the names and the words in the messages, wherein one or more thresholds are applied to the statistical attribute data.
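By way of illustration only, and not as part of the claims, the statistical analysis recited in claims 4 to 6 might be sketched as follows. The sketch counts occurrences of a candidate word and of all other words in messages directed to target users sharing a name, and applies a one-sided Fisher's exact test to the resulting 2x2 table. Several details are assumptions that do not come from the claims: the comparison against messages directed to users with other names, the definition of the confidence score as one minus the p-value, whitespace tokenization, and the function names `fisher_exact_greater` and `score_association`.

```python
from collections import Counter
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of seeing at least `a`
    occurrences in the top-left cell with all margins held fixed.
    """
    row1 = a + b            # all words in messages directed to the name
    col1 = a + c            # all occurrences of the candidate word
    total = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, row1)


def score_association(messages, name, word):
    """Hypothetical confidence score for a name-word association.

    `messages` is an iterable of (target_name, text) pairs. The score is
    1 - p, an illustrative choice rather than a definition from the claims.
    """
    in_name, elsewhere = Counter(), Counter()
    for target_name, text in messages:
        bucket = in_name if target_name == name else elsewhere
        bucket.update(text.lower().split())   # naive whitespace tokenizer
    a = in_name[word]                         # candidate word, same name
    b = sum(in_name.values()) - a             # other words, same name
    c = elsewhere[word]                       # candidate word, other names
    d = sum(elsewhere.values()) - c           # other words, other names
    return 1.0 - fisher_exact_greater(a, b, c, d)


if __name__ == "__main__":
    sample = [
        ("robert", "happy birthday bob"),
        ("robert", "hey bob have a great day"),
        ("robert", "happy birthday rob"),
        ("alice", "happy birthday ali"),
        ("alice", "have a great day"),
        ("carol", "happy birthday carol"),
    ]
    for candidate in ("bob", "happy"):
        print(candidate, round(score_association(sample, "robert", candidate), 3))
```

In this toy data, "bob" appears only in messages directed to users named "robert", so it scores higher than "happy", which is common across all names. A threshold of the kind recited in claims 3, 12, and 20 could then be applied to the score to decide which associations enter the dictionary; the threshold value itself would have to be chosen empirically.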
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/779,574 US20130173611A1 (en) | 2009-11-20 | 2013-02-27 | Generation of nickname dictionary |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/623,311 US8433762B1 (en) | 2009-11-20 | 2009-11-20 | Generation of nickname dictionary based on analysis of user communications |
| US13/779,574 US20130173611A1 (en) | 2009-11-20 | 2013-02-27 | Generation of nickname dictionary |
Related Parent Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/623,311 Continuation US8433762B1 (en) | 2009-11-20 | 2009-11-20 | Generation of nickname dictionary based on analysis of user communications |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20130173611A1 (en) | 2013-07-04 |
Family
ID=48146161
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/623,311 Expired - Fee Related US8433762B1 (en) | 2009-11-20 | 2009-11-20 | Generation of nickname dictionary based on analysis of user communications |
| US13/779,574 Abandoned US20130173611A1 (en) | 2009-11-20 | 2013-02-27 | Generation of nickname dictionary |
Family Applications Before (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/623,311 Expired - Fee Related US8433762B1 (en) | 2009-11-20 | 2009-11-20 | Generation of nickname dictionary based on analysis of user communications |
Country Status (1)
| Country | Link |
|---|---|
| US (2) | US8433762B1 (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8756499B1 (en) * | 2013-04-29 | 2014-06-17 | Google Inc. | Gesture keyboard input of non-dictionary character strings using substitute scoring |
| US9047268B2 (en) | 2013-01-31 | 2015-06-02 | Google Inc. | Character and word level language models for out-of-vocabulary text input |
| US9454240B2 (en) | 2013-02-05 | 2016-09-27 | Google Inc. | Gesture keyboard input of non-dictionary character strings |
| JP2018502404A (en) * | 2014-11-04 | 2018-01-25 | 華為技術有限公司Huawei Technologies Co.,Ltd. | Message display method, message display device, and message display device |
| US10180937B2 (en) * | 2017-02-16 | 2019-01-15 | International Business Machines Corporation | Cognitive entity reference recognition |
Families Citing this family (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9298783B2 (en) | 2007-07-25 | 2016-03-29 | Yahoo! Inc. | Display of attachment based information within a messaging system |
| US9584343B2 (en) | 2008-01-03 | 2017-02-28 | Yahoo! Inc. | Presentation of organized personal and public data using communication mediums |
| WO2010141216A2 (en) | 2009-06-02 | 2010-12-09 | Xobni Corporation | Self populating address book |
| US9721228B2 (en) | 2009-07-08 | 2017-08-01 | Yahoo! Inc. | Locally hosting a social network using social data stored on a user's computer |
| US8984074B2 (en) | 2009-07-08 | 2015-03-17 | Yahoo! Inc. | Sender-based ranking of person profiles and multi-person automatic suggestions |
| US8990323B2 (en) | 2009-07-08 | 2015-03-24 | Yahoo! Inc. | Defining a social network model implied by communications data |
| US7930430B2 (en) | 2009-07-08 | 2011-04-19 | Xobni Corporation | Systems and methods to provide assistance during address input |
| US9087323B2 (en) | 2009-10-14 | 2015-07-21 | Yahoo! Inc. | Systems and methods to automatically generate a signature block |
| US9020938B2 (en) | 2010-02-03 | 2015-04-28 | Yahoo! Inc. | Providing profile information using servers |
| US9571605B1 (en) * | 2010-04-27 | 2017-02-14 | Amdocs Software Systems Limited | System, method, and computer program for identifying a social network user identifier based on a user message |
| US8620935B2 (en) | 2011-06-24 | 2013-12-31 | Yahoo! Inc. | Personalizing an online service based on data collected for a user of a computing device |
| US8972257B2 (en) | 2010-06-02 | 2015-03-03 | Yahoo! Inc. | Systems and methods to present voice message information to a user of a computing device |
| US10078819B2 (en) | 2011-06-21 | 2018-09-18 | Oath Inc. | Presenting favorite contacts information to a user of a computing device |
| US9747583B2 (en) * | 2011-06-30 | 2017-08-29 | Yahoo Holdings, Inc. | Presenting entity profile information to a user of a computing device |
| US9727924B2 (en) | 2011-10-10 | 2017-08-08 | Salesforce.Com, Inc. | Computer implemented methods and apparatus for informing a user of social network data when the data is relevant to the user |
| US10192200B2 (en) | 2012-12-04 | 2019-01-29 | Oath Inc. | Classifying a portion of user contact data into local contacts |
| GB201320334D0 (en) | 2013-11-18 | 2014-01-01 | Microsoft Corp | Identifying a contact |
| US9619470B2 (en) | 2014-02-04 | 2017-04-11 | Google Inc. | Adaptive music and video recommendations |
| US9965492B1 (en) | 2014-03-12 | 2018-05-08 | Google Llc | Using location aliases |
| CN111131531B (en) * | 2018-11-01 | 2021-06-01 | 腾讯科技(深圳)有限公司 | Method and device for generating nickname in chat group and readable storage medium |
| KR20240167995A (en) * | 2023-05-22 | 2024-11-29 | 라인플러스 주식회사 | Method and system for automatically granting modifiers to users |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6052724A (en) * | 1997-09-02 | 2000-04-18 | Novell Inc | Method and system for managing a directory service |
| US20030084103A1 (en) * | 2001-10-29 | 2003-05-01 | Comverse, Ltd. | Method and system for third-party initiation of an anonymous tele-chat session |
| US20040024760A1 (en) * | 2002-07-31 | 2004-02-05 | Phonetic Research Ltd. | System, method and computer program product for matching textual strings using language-biased normalisation, phonetic representation and correlation functions |
| US20080228735A1 (en) * | 2007-03-16 | 2008-09-18 | Expanse Networks, Inc. | Lifestyle Optimization and Behavior Modification |
| US7627550B1 (en) * | 2006-09-15 | 2009-12-01 | Initiate Systems, Inc. | Method and system for comparing attributes such as personal names |
| US20100030715A1 (en) * | 2008-07-30 | 2010-02-04 | Kevin Francis Eustice | Social Network Model for Semantic Processing |
| US20100191782A1 (en) * | 2009-01-29 | 2010-07-29 | Brzozowski Michael J | Assigning content to an entry in directory |
| US20100293195A1 (en) * | 2009-05-12 | 2010-11-18 | Comcast Interactive Media, Llc | Disambiguation and Tagging of Entities |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090307057A1 (en) * | 2008-06-06 | 2009-12-10 | Albert Azout | Associative memory operators, methods and computer program products for using a social network for predictive marketing analysis |
| US8504481B2 (en) * | 2008-07-22 | 2013-08-06 | New Jersey Institute Of Technology | System and method for protecting user privacy using social inference protection techniques |
| US20100114887A1 (en) * | 2008-10-17 | 2010-05-06 | Google Inc. | Textual Disambiguation Using Social Connections |
| US8386574B2 (en) * | 2009-10-29 | 2013-02-26 | Xerox Corporation | Multi-modality classification for one-class classification in social networks |
- 2009-11-20: US12/623,311, patent US8433762B1, not active (Expired - Fee Related)
- 2013-02-27: US13/779,574, patent US20130173611A1, not active (Abandoned)
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6052724A (en) * | 1997-09-02 | 2000-04-18 | Novell Inc | Method and system for managing a directory service |
| US20030084103A1 (en) * | 2001-10-29 | 2003-05-01 | Comverse, Ltd. | Method and system for third-party initiation of an anonymous tele-chat session |
| US20040024760A1 (en) * | 2002-07-31 | 2004-02-05 | Phonetic Research Ltd. | System, method and computer program product for matching textual strings using language-biased normalisation, phonetic representation and correlation functions |
| US7627550B1 (en) * | 2006-09-15 | 2009-12-01 | Initiate Systems, Inc. | Method and system for comparing attributes such as personal names |
| US20080228735A1 (en) * | 2007-03-16 | 2008-09-18 | Expanse Networks, Inc. | Lifestyle Optimization and Behavior Modification |
| US20100030715A1 (en) * | 2008-07-30 | 2010-02-04 | Kevin Francis Eustice | Social Network Model for Semantic Processing |
| US20100191782A1 (en) * | 2009-01-29 | 2010-07-29 | Brzozowski Michael J | Assigning content to an entry in directory |
| US20100293195A1 (en) * | 2009-05-12 | 2010-11-18 | Comcast Interactive Media, Llc | Disambiguation and Tagging of Entities |
Non-Patent Citations (1)
| Title |
|---|
| Donna Roberts, "Complement of an Event," 2/2001, Oswego City School District Regents Exam Prep Center, www.regentsprep.org/regents/math/algebra/apr6/lcompl.htm * |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9047268B2 (en) | 2013-01-31 | 2015-06-02 | Google Inc. | Character and word level language models for out-of-vocabulary text input |
| US9454240B2 (en) | 2013-02-05 | 2016-09-27 | Google Inc. | Gesture keyboard input of non-dictionary character strings |
| US10095405B2 (en) | 2013-02-05 | 2018-10-09 | Google Llc | Gesture keyboard input of non-dictionary character strings |
| US8756499B1 (en) * | 2013-04-29 | 2014-06-17 | Google Inc. | Gesture keyboard input of non-dictionary character strings using substitute scoring |
| JP2018502404A (en) * | 2014-11-04 | 2018-01-25 | 華為技術有限公司Huawei Technologies Co.,Ltd. | Message display method, message display device, and message display device |
| US11095627B2 (en) | 2014-11-04 | 2021-08-17 | Huawei Technologies Co., Ltd. | Message display method, apparatus, and device |
| US10180937B2 (en) * | 2017-02-16 | 2019-01-15 | International Business Machines Corporation | Cognitive entity reference recognition |
| US10366162B2 (en) * | 2017-02-16 | 2019-07-30 | International Business Machines Corporation | Cognitive entity reference recognition |
Also Published As
| Publication number | Publication date |
|---|---|
| US8433762B1 (en) | 2013-04-30 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US8433762B1 (en) | Generation of nickname dictionary based on analysis of user communications | |
| US20210349964A1 (en) | Predictive resource identification and phased delivery of structured documents | |
| US20150046152A1 (en) | Determining concept blocks based on context | |
| US20210279224A1 (en) | Combinators | |
| JP6419905B2 (en) | Using inverse operators on queries | |
| KR101419828B1 (en) | Composite term index for graph data | |
| US10354083B2 (en) | Social network site including trust-based wiki functionality | |
| KR101921650B1 (en) | Coefficients attribution for different objects based on natural language processing | |
| EP2764495B1 (en) | Social network recommended content and recommending members for personalized search results | |
| Oussalah et al. | A software architecture for Twitter collection, search and geolocation services | |
| CN102971762B (en) | Facilitate interaction between social network users | |
| US8533297B2 (en) | Setting cookies in conjunction with phased delivery of structured documents | |
| US7945862B2 (en) | Social network site including contact-based recommendation functionality | |
| CN103380421B (en) | Distributed cache for graph data | |
| CN100530177C (en) | Method, system, and apparatus for receiving and responding to knowledge interchange queries | |
| JP6407968B2 (en) | Variable search query vertical access | |
| US9407589B2 (en) | System and method for following topics in an electronic textual conversation | |
| US10331749B2 (en) | Selective presentation of content types and sources in search | |
| US20090070294A1 (en) | Social Networking Site Including Conversation Thread Viewing Functionality | |
| US20120239663A1 (en) | Perspective-based content filtering | |
| CN102792300A (en) | Customizable semantic search based on user roles | |
| CN103493045A (en) | Automated answers to online questions | |
| WO2022121227A1 (en) | Data storage method and apparatus, query method, electronic device, and readable medium | |
| US20150149487A1 (en) | Integrating Online Search Results and Social Networks | |
| Drakopoulos et al. | Evaluating Twitter Influence Ranking with System Theory. | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
| | AS | Assignment | Owner name: META PLATFORMS, INC., CALIFORNIA; Free format text: CHANGE OF NAME; ASSIGNOR: FACEBOOK, INC.; REEL/FRAME: 058553/0802; Effective date: 20211028 |