Chat completion API: support new model gpt-4o-audio-preview #71

Merged
6 changes: 3 additions & 3 deletions README-zh.md
Original file line number Diff line number Diff line change
@@ -21,14 +21,14 @@ OpenAi4J是一个非官方的Java库,旨在帮助java开发者与OpenAI的GPT
## 导入依赖
### Gradle

`implementation 'io.github.lambdua:<api|client|service>:0.22.4'`
`implementation 'io.github.lambdua:<api|client|service>:0.22.5'`
### Maven
```xml

<dependency>
<groupId>io.github.lambdua</groupId>
<artifactId>service</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</dependency>
```

@@ -61,7 +61,7 @@ static void simpleChat() {
<dependency>
<groupId>io.github.lambdua</groupId>
<artifactId>api</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</dependency>
```

6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
@@ -25,14 +25,14 @@ applications effortlessly.
## Import
### Gradle

`implementation 'io.github.lambdua:<api|client|service>:0.22.4'`
`implementation 'io.github.lambdua:<api|client|service>:0.22.5'`
### Maven
```xml

<dependency>
<groupId>io.github.lambdua</groupId>
<artifactId>service</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</dependency>
```

@@ -67,7 +67,7 @@ To utilize pojos, import the api module:
<dependency>
<groupId>io.github.lambdua</groupId>
<artifactId>api</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</dependency>
```

2 changes: 1 addition & 1 deletion api/pom.xml
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@
<parent>
<groupId>io.github.lambdua</groupId>
<artifactId>openai-java</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</parent>
<packaging>jar</packaging>
<artifactId>api</artifactId>
Original file line number Diff line number Diff line change
@@ -1,15 +1,14 @@
package com.theokanning.openai.completion.chat;

import java.util.List;

import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.theokanning.openai.utils.JsonUtil;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.List;

/**
* @author LiangTao
* @date 2024-04-10 10:31
@@ -41,6 +40,10 @@ public class AssistantMessage implements ChatMessage {
*/
private String refusal;

/**
* Data about a previous audio response from the model.
*/
private AssistantMessageAudio audio;


public AssistantMessage(String content) {
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
package com.theokanning.openai.completion.chat;

import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.NonNull;

/**
* @author Allen Hu
* @date 2024/11/6
*/
@Data
@NoArgsConstructor
@AllArgsConstructor
public class AssistantMessageAudio {

/**
* Unique identifier for a previous audio response from the model.
*/
@NonNull
private String id;

/**
* The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
*/
@JsonProperty("expires_at")
private Integer expiresAt;

/**
* Transcript of the audio generated by the model.
*/
private String transcript;

/**
* Base64 encoded audio bytes generated by the model, in the format specified in the request.
*/
private String data;
}
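
The `expires_at` and `data` fields above suggest two things a caller would typically do with an audio response: check whether the server-side audio can still be referenced in a follow-up turn, and decode the Base64 payload into raw bytes. The following is a minimal sketch using only the JDK. The class `AudioResponseSketch` and its method names are hypothetical and not part of this PR; the sketch takes the timestamp as a `long`, whereas the PR declares `expiresAt` as an `Integer`.

```java
import java.time.Instant;
import java.util.Base64;

/** Hypothetical helper (not part of the PR) showing how the audio response fields might be consumed. */
public class AudioResponseSketch {

    /** Returns true if the Unix timestamp (in seconds) lies at or before the given instant. */
    public static boolean isExpired(long expiresAtEpochSeconds, Instant now) {
        return !Instant.ofEpochSecond(expiresAtEpochSeconds).isAfter(now);
    }

    /** Decodes the Base64 "data" field into raw audio bytes. */
    public static byte[] decodeAudio(String base64Data) {
        return Base64.getDecoder().decode(base64Data);
    }

    public static void main(String[] args) {
        Instant now = Instant.ofEpochSecond(1_730_900_000L);
        System.out.println(isExpired(1_730_800_000L, now)); // true: expiry is in the past
        System.out.println(isExpired(1_731_000_000L, now)); // false: expiry is in the future
        System.out.println(decodeAudio("UklGRg==").length); // 4 (the ASCII bytes "RIFF")
    }
}
```

Passing the clock in as a parameter keeps the expiry check deterministic in tests.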
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
package com.theokanning.openai.completion.chat;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
* Parameters for audio output. Required when audio output is requested with modalities: ["audio"]
*
* @author Allen Hu
* @date 2024/11/5
*/
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Audio {

/**
* The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, sage, shimmer, and verse.
*/
String voice;

/**
* Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.
*/
String format;
}
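
Because `voice` and `format` are plain strings, a typo only surfaces as an API error at request time. One option is a client-side check against the values listed in the Javadoc above. This is a sketch under that assumption: `AudioParamsSketch` is a hypothetical name, the PR itself performs no such validation, and the hard-coded lists would need updating if the API adds values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical validator (not part of the PR) for the voice/format strings documented on Audio. */
public class AudioParamsSketch {

    // Values copied from the Javadoc in this PR; they may change on the API side.
    static final Set<String> VOICES = Collections.unmodifiableSet(new HashSet<>(
            Arrays.asList("alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", "verse")));
    static final Set<String> FORMATS = Collections.unmodifiableSet(new HashSet<>(
            Arrays.asList("wav", "mp3", "flac", "opus", "pcm16")));

    /** Returns true only if both the voice and the output format are among the documented values. */
    public static boolean isValid(String voice, String format) {
        return VOICES.contains(voice) && FORMATS.contains(format);
    }

    public static void main(String[] args) {
        System.out.println(isValid("alloy", "wav")); // true
        System.out.println(isValid("alloy", "ogg")); // false: "ogg" is not a documented format
    }
}
```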
Original file line number Diff line number Diff line change
@@ -168,5 +168,18 @@ public class ChatCompletionRequest {
@JsonProperty("parallel_tool_calls")
Boolean parallelToolCalls;

/**
* Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default:
* ["text"]
* The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:
* ["text", "audio"]
*
* @see <a href="https://platform.openai.com/docs/api-reference/chat/create#chat-create-modalities">Chat create: modalities</a>
*/
List<String> modalities;

/**
* Parameters for audio output. Required when audio output is requested with modalities: ["audio"].
*/
Audio audio;
}
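
The two new fields are coupled: per the Javadoc, `audio` is required whenever `modalities` contains `"audio"`. The request class does not enforce this, so a caller may want a pre-flight check. A minimal sketch of that rule follows; `ModalitiesSketch` and its methods are hypothetical, and the audio parameters are passed as a plain `Object` to keep the example free of library types.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical pre-flight check (not part of the PR) for the modalities/audio coupling. */
public class ModalitiesSketch {

    /** Returns true when the request asks for audio output. */
    public static boolean requiresAudioParams(List<String> modalities) {
        return modalities != null && modalities.contains("audio");
    }

    /** Throws if audio output is requested but no audio parameters were supplied. */
    public static void validate(List<String> modalities, Object audioParams) {
        if (requiresAudioParams(modalities) && audioParams == null) {
            throw new IllegalArgumentException(
                    "audio parameters are required when modalities contains \"audio\"");
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList("text"), null);                    // passes: text-only request
        validate(Arrays.asList("text", "audio"), new Object());   // passes: audio params present
        try {
            validate(Arrays.asList("text", "audio"), null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```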
Original file line number Diff line number Diff line change
@@ -49,6 +49,8 @@ ImageContent parseContent(JsonParser jsonParser) throws IOException {
content.setImageUrl(parseImageUrl(jsonParser));
} else if ("image_file".equals(fieldName)) {
content.setImageFile(parseImageFile(jsonParser));
} else if ("input_audio".equals(fieldName)) {
content.setInputAudio(parseInputAudio(jsonParser));
}
}
return content;
@@ -83,4 +85,19 @@ private ImageUrl parseImageUrl(JsonParser jsonParser) throws IOException {
}
return new ImageUrl(url, detail);
}

private InputAudio parseInputAudio(JsonParser jsonParser) throws IOException {
String data = null;
String format = null;
while (jsonParser.nextToken() != JsonToken.END_OBJECT) {
String fieldName = jsonParser.getCurrentName();
jsonParser.nextToken();
if ("data".equals(fieldName)) {
data = jsonParser.getText();
} else if ("format".equals(fieldName)) {
format = jsonParser.getText();
}
}
return new InputAudio(data, format);
}
}
Original file line number Diff line number Diff line change
@@ -35,6 +35,9 @@ public void serialize(Object o, JsonGenerator jsonGenerator, SerializerProvider
if (ic.getType().equals("image_file")) {
jsonGenerator.writeObjectField("image_file", ic.getImageFile());
}
if (ic.getType().equals("input_audio")) {
jsonGenerator.writeObjectField("input_audio", ic.getInputAudio());
}
jsonGenerator.writeEndObject();
}
jsonGenerator.writeEndArray();
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@
public class ImageContent {

/**
* The type of the content. Either "text" or "image_url".
* The type of the content. One of "text", "image_url", "image_file" or "input_audio".
*/
@NonNull
private String type;
@@ -39,6 +39,10 @@ public class ImageContent {
@JsonProperty("image_file")
private ImageFile imageFile;

@JsonInclude(JsonInclude.Include.NON_NULL)
@JsonProperty("input_audio")
private InputAudio inputAudio;


public ImageContent(String text) {
this.type = "text";
@@ -50,14 +54,42 @@ public ImageContent(ImageUrl imageUrl) {
this.imageUrl = imageUrl;
}

/**
* @deprecated use {@link #ofImagePath(Path)} instead
*/
@Deprecated
public ImageContent(Path imagePath){
this.type = "image_url";
String imagePathString = imagePath.toAbsolutePath().toString();
String extension = imagePathString.substring(imagePathString.lastIndexOf('.') + 1);
this.imageUrl=new ImageUrl( "data:image/" + extension + ";base64," + encodeImage(imagePath));
}

private String encodeImage(Path imagePath) {
public ImageContent(InputAudio inputAudio) {
this.type = "input_audio";
this.inputAudio = inputAudio;
}

public static ImageContent ofImagePath(Path imagePath){
String imagePathString = imagePath.toAbsolutePath().toString();
String extension = imagePathString.substring(imagePathString.lastIndexOf('.') + 1);
ImageUrl imageUrl = new ImageUrl("data:image/" + extension + ";base64," + encode2base64(imagePath));
return new ImageContent(imageUrl);
}

public static ImageContent ofAudioPath(Path inputAudioPath) {
String inputAudioPathString = inputAudioPath.toAbsolutePath().toString();
String extension = inputAudioPathString.substring(inputAudioPathString.lastIndexOf('.') + 1);
String base64 = encode2base64(inputAudioPath);
InputAudio inputAudio = new InputAudio(base64, extension);
return new ImageContent(inputAudio);
}

/**
* @deprecated use {@link #encode2base64(Path)}
*/
@Deprecated
private static String encodeImage(Path imagePath) {
byte[] fileContent;
try {
fileContent = Files.readAllBytes(imagePath);
@@ -67,4 +99,13 @@ private String encodeImage(Path imagePath) {
}
}

private static String encode2base64(Path path) {
byte[] fileContent;
try {
fileContent = Files.readAllBytes(path);
return Base64.getEncoder().encodeToString(fileContent);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
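
`ofAudioPath` above derives the audio format from the file extension and Base64-encodes the file contents. The standalone sketch below reproduces that idea with only the JDK. It is not the PR's code: `AudioPathSketch` is a hypothetical name, and unlike the PR it lower-cases the extension and returns an empty string when the file name has no dot, which is an assumption about what a caller might prefer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.Locale;

/** Hypothetical standalone version (not part of the PR) of the extension + Base64 steps in ofAudioPath. */
public class AudioPathSketch {

    /** Returns the lower-cased extension of the file name, or "" if the name contains no dot. */
    public static String extensionOf(Path path) {
        String name = path.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Reads the whole file and returns its contents as a Base64 string. */
    public static String encodeToBase64(Path path) {
        try {
            return Base64.getEncoder().encodeToString(Files.readAllBytes(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".WAV");
        try {
            Files.write(tmp, new byte[] {'R', 'I', 'F', 'F'});
            System.out.println(extensionOf(tmp));    // wav
            System.out.println(encodeToBase64(tmp)); // UklGRg==
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Looking only at the file name, rather than the full absolute path, avoids picking up a dot from a parent directory when the file itself has no extension.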
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
package com.theokanning.openai.completion.chat;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.NonNull;

/**
* @author Allen Hu
* @date 2024/11/6
*/
@Data
@NoArgsConstructor
@AllArgsConstructor
public class InputAudio {

/**
* Base64 encoded audio data.
*/
@NonNull
private String data;

/**
* The format of the encoded audio data. Currently supports "wav" and "mp3".
*/
@NonNull
private String format;
}
Original file line number Diff line number Diff line change
@@ -83,11 +83,23 @@ public static UserMessage buildImageMessage(String prompt, String... imageUrls)
* @return com.theokanning.openai.completion.chat.UserMessage
**/
public static UserMessage buildImageMessage(String prompt, Path... imagePaths) {
List<ImageContent> imageContents = Arrays.stream(imagePaths).map(ImageContent::new).collect(Collectors.toList());
List<ImageContent> imageContents = Arrays.stream(imagePaths).map(ImageContent::ofImagePath).collect(Collectors.toList());
imageContents.add(0, new ImageContent(prompt));
return new UserMessage(imageContents);
}


/**
* Builds an audio recognition request message; supports multiple audio files.
* @param prompt query text
* @param inputAudioPaths local paths of the audio files
* @return com.theokanning.openai.completion.chat.UserMessage
* @author Allen Hu
* @date 2024/11/6
*/
public static UserMessage buildInputAudioMessage(String prompt, Path... inputAudioPaths) {
List<ImageContent> imageContents = Arrays.stream(inputAudioPaths).map(ImageContent::ofAudioPath).collect(Collectors.toList());
imageContents.add(0, new ImageContent(prompt));
return new UserMessage(imageContents);
}
}

2 changes: 1 addition & 1 deletion client/pom.xml
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@
<parent>
<groupId>io.github.lambdua</groupId>
<artifactId>openai-java</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</parent>
<packaging>jar</packaging>

4 changes: 2 additions & 2 deletions example/pom.xml
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@

<groupId>io.github.lambdua</groupId>
<artifactId>example</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
<name>example</name>

<properties>
@@ -17,7 +17,7 @@
<dependency>
<groupId>io.github.lambdua</groupId>
<artifactId>service</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</dependency>

</dependencies>
2 changes: 1 addition & 1 deletion pom.xml
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@

<groupId>io.github.lambdua</groupId>
<artifactId>openai-java</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
<packaging>pom</packaging>
<description>openai java 版本</description>
<url>https://github.com/Lambdua/openai-java</url>
2 changes: 1 addition & 1 deletion service/pom.xml
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@
<parent>
<groupId>io.github.lambdua</groupId>
<artifactId>openai-java</artifactId>
<version>0.22.4</version>
<version>0.22.5</version>
</parent>
<packaging>jar</packaging>
