-
Gemma 3 Technical Report
Authors:
Gemma Team,
Aishwarya Kamath,
Johan Ferret,
Shreya Pathak,
Nino Vieillard,
Ramona Merhej,
Sarah Perrin,
Tatiana Matejovicova,
Alexandre Ramé,
Morgane Rivière,
Louis Rouillard,
Thomas Mesnard,
Geoffrey Cideron,
Jean-bastien Grill,
Sabela Ramos,
Edouard Yvinec,
Michelle Casbon,
Etienne Pot,
Ivo Penchev,
Gaël Liu,
Francesco Visin,
Kathleen Kenealy,
Lucas Beyer,
Xiaohai Zhai,
Anton Tsitsulin,
et al. (191 additional authors not shown)
Abstract:
We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, wider coverage of languages, and longer context of at least 128K tokens. We also change the architecture of the model to reduce the KV-cache memory that tends to explode with long context. This is achieved by increasing the ratio of local to global attention layers and keeping the span of local attention short. The Gemma 3 models are trained with distillation and achieve superior performance to Gemma 2 for both pre-trained and instruction-finetuned versions. In particular, our novel post-training recipe significantly improves the math, chat, instruction-following, and multilingual abilities, making Gemma3-4B-IT competitive with Gemma2-27B-IT and Gemma3-27B-IT comparable to Gemini-1.5-Pro across benchmarks. We release all our models to the community.
Submitted 25 March, 2025;
originally announced March 2025.
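The memory argument in the abstract can be illustrated with back-of-envelope arithmetic. The sketch below is a minimal model under assumed numbers (the layer count, window size, and local:global ratio are illustrative, not figures from the report): global attention layers cache keys/values for the whole context, while local layers cache only their sliding window.

```python
# Back-of-envelope KV-cache sizing for a mix of global and local
# (sliding-window) attention layers. All concrete numbers here are
# illustrative assumptions, not figures from the Gemma 3 report.

def kv_cache_tokens(n_layers, local_ratio, context, window):
    """Total cached tokens across layers: global layers cache the full
    context, local layers cache only their sliding window."""
    n_local = round(n_layers * local_ratio)
    n_global = n_layers - n_local
    return n_global * context + n_local * min(window, context)

context, window, layers = 128_000, 1_024, 48
all_global = kv_cache_tokens(layers, 0.0, context, window)      # 6,144,000
mostly_local = kv_cache_tokens(layers, 5 / 6, context, window)  # 1,064,960
# With a 5:1 local:global ratio, cached tokens drop by roughly 5.8x
# at 128K context, which is the effect the abstract describes.
```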
-
Perfect matching cuts partitioning a graph into complementary subgraphs
Authors:
Diane Castonguay,
Erika M. M. Coelho,
Hebert Coelho,
Julliano R. Nascimento,
Uéverton S. Souza
Abstract:
In Partition Into Complementary Subgraphs (Comp-Sub) we are given a graph $G=(V,E)$, and an edge set property $Π$, and asked whether $G$ can be decomposed into two graphs, $H$ and its complement $\overline{H}$, for some graph $H$, in such a way that the edge cut $[V(H),V(\overline{H})]$ satisfies the property $Π$. Motivated by previous work, we consider Comp-Sub($Π$) when the property $Π=\mathcal{PM}$ specifies that the edge cut of the decomposition is a perfect matching. We prove that Comp-Sub($\mathcal{PM}$) is GI-hard when the graph $G$ is $\{C_{k\geq 7}, \overline{C}_{k\geq 7} \}$-free. On the other hand, we show that Comp-Sub($\mathcal{PM}$) is polynomial-time solvable on $hole$-free graphs and on $P_5$-free graphs. Furthermore, we present characterizations of Comp-Sub($\mathcal{PM}$) on chordal, distance-hereditary, and extended $P_4$-laden graphs.
Submitted 13 October, 2022;
originally announced October 2022.
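The cut condition in Comp-Sub($\mathcal{PM}$) is easy to check for a given bipartition. The sketch below (function name and adjacency-dict representation are my own, not from the paper) verifies only the perfect-matching-cut condition, ignoring the separate requirement that the two sides induce complementary graphs:

```python
def is_perfect_matching_cut(adj, side_a):
    """Check that the edge cut [A, V \\ A] is a perfect matching:
    every vertex must have exactly one neighbor across the partition."""
    A = set(side_a)
    return all(sum((u in A) != (v in A) for v in adj[u]) == 1 for u in adj)

# C4 split into two opposite edges: the cut between {0,1} and {2,3}
# consists of the edges 1-2 and 3-0, which form a perfect matching.
c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```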
-
Characterizing the Experience of Subjects in Software Engineering Studies
Authors:
Rafael de Mello,
Matheus Coelho
Abstract:
Context: Empirical studies in software engineering are typically centered on human subjects, ranging from novice to experienced developers. The experience of these individuals is a key context factor that should be properly characterized to support the design of empirical studies and the interpretation of their results. However, the criteria adopted for characterizing the experience of subjects do not follow a standard and are frequently limited. Goal: Our research aims at establishing an optimized and comprehensive scheme to characterize subjects' experience in software engineering studies. Method: Based on previous work, we defined the first version of this scheme, composed of three experience attributes: time, number of projects, and self-perception. Over recent years, we applied the characterization scheme in four empirical studies, characterizing 79 subjects with respect to three different skills. Results: We found that the attributes of our scheme are positively but moderately correlated. This finding suggests these attributes play a complementary role in characterizing subjects' experience. Besides, we found that study subjects tend to enumerate the technical diversity of their background when summarizing their professional experience. Conclusion: The proposed scheme represents a feasible alternative for characterizing subjects of empirical studies in the field. However, we intend to conduct additional investigations with developers to evolve it.
Submitted 6 October, 2021;
originally announced October 2021.
-
Comparative study of metaheuristics for graph coloring problems
Authors:
Flávio José Mendes Coelho
Abstract:
A classic graph coloring problem is to assign colors to the vertices of a graph so that adjacent vertices receive distinct colors. An optimal graph coloring uses the minimum number of colors, which is the graph's chromatic number. Determining the chromatic number is a combinatorial optimization problem proven to be computationally intractable: no algorithm is known that solves large instances of the problem in reasonable time. For this reason, approximate methods and metaheuristics form a set of techniques that do not guarantee optimality but obtain good solutions in reasonable time. This paper reports a comparative study of the Hill-Climbing, Simulated Annealing, Tabu Search, and Iterated Local Search metaheuristics for the classic graph coloring problem, comparing their time efficiency on the DSJC125 and DSJC250 instances of the DIMACS benchmark.
Submitted 18 December, 2019;
originally announced December 2019.
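Of the metaheuristics compared, Hill-Climbing is the simplest to sketch. The code below is a minimal, illustrative variant, not the paper's exact configuration (the number of colors, iteration budget, and min-conflicts move rule are my assumptions): it repeatedly recolors a randomly chosen conflicting vertex with the color that minimizes its local conflicts.

```python
import random

def conflicts(adj, coloring):
    """Count edges whose endpoints share a color."""
    return sum(1 for u in adj for v in adj[u]
               if u < v and coloring[u] == coloring[v])

def hill_climb(adj, k, iters=10_000, seed=0):
    """Minimal hill-climbing sketch for k-coloring: repeatedly recolor a
    random conflicting vertex with its locally best (min-conflicts) color.
    May stall in a local optimum on hard instances; this is a sketch."""
    rng = random.Random(seed)
    col = {v: rng.randrange(k) for v in adj}
    for _ in range(iters):
        bad = [v for v in adj if any(col[v] == col[u] for u in adj[v])]
        if not bad:
            break  # proper coloring found
        v = rng.choice(bad)
        col[v] = min(range(k), key=lambda c: sum(col[u] == c for u in adj[v]))
    return col, conflicts(adj, col)

# 5-cycle, chromatic number 3: hill climbing finds a proper 3-coloring.
c5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring, remaining = hill_climb(c5, 3)
```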
-
A note on the convexity number for complementary prisms
Authors:
Diane Castonguay,
Erika M. M. Coelho,
Hebert Coelho,
Julliano R. Nascimento
Abstract:
In the geodetic convexity, a set of vertices $S$ of a graph $G$ is $\textit{convex}$ if all vertices belonging to any shortest path between two vertices of $S$ lie in $S$. The cardinality $con(G)$ of a maximum proper convex set $S$ of $G$ is the $\textit{convexity number}$ of $G$. The $\textit{complementary prism}$ $G\overline{G}$ of a graph $G$ arises from the disjoint union of the graph $G$ and $\overline{G}$ by adding the edges of a perfect matching between the corresponding vertices of $G$ and $\overline{G}$. In this work, we prove that the decision problem related to the convexity number is NP-complete even when restricted to complementary prisms, we determine $con(G\overline{G})$ when $G$ is disconnected or $G$ is a cograph, and we present a lower bound when $diam(G) \neq 3$.
Submitted 13 July, 2019; v1 submitted 21 September, 2018;
originally announced September 2018.
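The complementary prism construction described in the abstract can be written down directly. Below is a minimal sketch (the adjacency-dict representation and the convention of tagging complement-copy vertices as `(v, "bar")` are my own): take G, its complement on a second copy of the vertices, and a perfect matching joining corresponding vertices.

```python
def complementary_prism(adj):
    """Build G-bar(G): the disjoint union of G (vertices v) and its
    complement (vertices (v, "bar")), plus the perfect-matching edges
    v -- (v, "bar")."""
    V = list(adj)
    prism = {v: set(adj[v]) | {(v, "bar")} for v in V}
    for v in V:
        # Complement copy: join (v,bar) to (u,bar) iff uv is NOT an edge of G,
        # plus the matching edge back to v.
        prism[(v, "bar")] = (
            {(u, "bar") for u in V if u != v and u not in adj[v]} | {v}
        )
    return prism

# Example: the path P3 (0-1-2); its complement has the single edge 0-2.
p3 = {0: [1], 1: [0, 2], 2: [1]}
prism = complementary_prism(p3)
```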
-
On the Geodetic Hull Number of Complementary Prisms
Authors:
Erika M. M. Coelho,
Hebert Coelho,
Julliano R. Nascimento,
Jayme L. Szwarcfiter
Abstract:
Let $G$ be a finite, simple, and undirected graph and let $S$ be a set of vertices of $G$. In the geodetic convexity, a set of vertices $S$ of a graph $G$ is convex if all vertices belonging to any shortest path between two vertices of $S$ lie in $S$. The convex hull $H(S)$ of $S$ is the smallest convex set containing $S$. If $H(S) = V(G)$, then $S$ is a hull set. The cardinality $h(G)$ of a minimum hull set of $G$ is the hull number of $G$. The complementary prism $G\overline{G}$ of a graph $G$ arises from the disjoint union of the graph $G$ and $\overline{G}$ by adding the edges of a perfect matching between the corresponding vertices of $G$ and $\overline{G}$. Motivated by previous work, we determine and present lower and upper bounds on the hull number of complementary prisms of trees, disconnected graphs, and cographs. We also show that the hull number of complementary prisms cannot be bounded in the geodetic convexity, unlike in the $P_3$-convexity.
Submitted 22 July, 2018;
originally announced July 2018.
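The hull operation used throughout the abstract is straightforward to compute on small graphs. The sketch below (names and representation are my own, and it assumes a connected graph) grows $H(S)$ by repeatedly adding every vertex lying on some shortest path between two current hull vertices, detected with BFS distances via the identity $d(u,w) + d(w,v) = d(u,v)$:

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, s):
    """Distances from s to every vertex by breadth-first search."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def geodetic_hull(adj, S):
    """Geodetic convex hull H(S): iterate to a fixed point, each round
    adding every vertex on a shortest path between two hull vertices.
    Assumes adj describes a connected graph."""
    d = {v: bfs_dist(adj, v) for v in adj}
    H = set(S)
    changed = True
    while changed:
        changed = False
        for u, v in combinations(list(H), 2):
            for w in adj:
                if w not in H and d[u][w] + d[w][v] == d[u][v]:
                    H.add(w)
                    changed = True
    return H

# Path P4: the two endpoints form a hull set, so h(P4) = 2.
p4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```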
-
Error concealment by means of motion refinement and regularized Bregman divergence
Authors:
Alessandra M. Coelho,
Vania V. Estrela,
Felipe P. do Carmo,
Sandro R. Fernandes
Abstract:
This work addresses the problem of error concealment in video transmission systems over noisy channels, employing Bregman divergences along with regularization. Error concealment aims to mitigate the effects of disturbances at the receiver caused by bit errors or cell loss in packet networks. Bregman regularization gives accurate answers after just a few iterations, with fast convergence, better accuracy, and stability. The technique is adaptive in nature: the regularization functional is updated according to Bregman functions that change from iteration to iteration, following the nature of the neighborhood under study at iteration n. Numerical experiments show that high-quality regularization parameter estimates can be obtained. Convergence is sped up while the regularization parameter estimation becomes less empirical and more automatic.
Submitted 10 November, 2016;
originally announced November 2016.
-
EM-Based Mixture Models Applied to Video Event Detection
Authors:
Alessandra Martins Coelho,
Vania V. Estrela
Abstract:
Surveillance system (SS) development requires hi-tech support to overcome the shortcomings related to the massive quantity of visual information produced by SSs. Anything but reduced human monitoring has become impossible given its physical and economic implications, and a move towards automated surveillance becomes the only way out. For a computer vision system, automatic video event comprehension is a challenging task due to motion clutter, event understanding under complex scenes, multilevel semantic event inference, contextualization of events and views obtained from multiple cameras, unevenness of motion scales, shape changes, occlusions, and object interactions, among many other impairments. In recent years, state-of-the-art models for video event classification and recognition include modeling events to discern context, detecting incidents with only one camera, low-level feature extraction and description, high-level semantic event classification, and recognition. Even so, it is still very burdensome to retrieve or label a specific video segment relying solely on its content. Principal component analysis (PCA) is widely known and used, but when combined with other techniques, such as the expectation-maximization (EM) algorithm, its computation becomes more efficient. This chapter introduces advances associated with probabilistic PCA (PPCA) analysis of video events, and it also looks closely at ways and metrics to evaluate these less intensive EM implementations of PCA and KPCA.
Submitted 10 October, 2016;
originally announced October 2016.
-
Blind signal separation and identification of mixtures of images
Authors:
Felipe P. do Carmo,
Joaquim T. de Assis,
Vania V. Estrela,
Alessandra M. Coelho
Abstract:
In this paper, a fresh procedure to handle image mixtures by means of blind signal separation, relying on a combination of second-order and higher-order statistics techniques, is introduced. The problem of blind signal separation is recast in the wavelet domain. The key idea behind this method is that the image mixture can be decomposed into the sum of uncorrelated and/or independent sub-bands using the wavelet transform. Initially, the observed image is pre-whitened in the space domain. Afterwards, an initial separation matrix is estimated from the second-order statistics de-correlation model in the wavelet domain. This matrix is then used as the initial separation matrix for the higher-order statistics stage in order to find the best separation matrix. The suggested algorithm was tested using natural images. Experiments have confirmed that the proposed process provides promising outcomes in identifying an image from noisy mixtures of images.
Submitted 26 March, 2016;
originally announced March 2016.
-
State-of-the-Art Motion Estimation in the Context of 3D TV
Authors:
Vania V. Estrela,
Alessandra M. Coelho
Abstract:
Progress in image sensors and computation power has fueled studies to improve acquisition, processing, and analysis of 3D streams, along with 3D scene/object reconstruction. This chapter investigates the role of motion compensation/motion estimation (MCME) in end-to-end 3D TV. Motion vectors (MVs) are closely related to the concept of disparities, and they can help improve dynamic scene acquisition, content creation, 2D-to-3D conversion, compression coding, decompression/decoding, scene rendering, error concealment, virtual/augmented reality handling, intelligent content retrieval, and displaying. Although there are different 3D shape extraction methods, this chapter focuses mostly on shape-from-motion (SfM) techniques due to their relevance to 3D TV. SfM extraction can recover 3D shape information from single-camera data.
Submitted 23 December, 2013;
originally announced December 2013.
-
Content-Based Filtering for Video Sharing Social Networks
Authors:
Eduardo Valle,
Sandra de Avila,
Antonio da Luz Jr.,
Fillipe de Souza,
Marcelo Coelho,
Arnaldo Araújo
Abstract:
In this paper we compare the use of several features in the task of content filtering for video social networks, a very challenging task, not only because the unwanted content is related to very high-level semantic concepts (e.g., pornography, violence, etc.) but also because videos from social networks are extremely assorted, preventing the use of constrained a priori information. We propose a simple method, able to combine diverse evidence, coming from different features and various video elements (entire video, shots, frames, keyframes, etc.). We evaluate our method in three social network applications, related to the detection of unwanted content - pornographic videos, violent videos, and videos posted to artificially manipulate popularity scores. Using challenging test databases, we show that this simple scheme is able to obtain good results, provided that adequate features are chosen. Moreover, we establish a representation using codebooks of spatiotemporal local descriptors as critical to the success of the method in all three contexts. This is consequential, since the state-of-the-art still relies heavily on static features for the tasks addressed.
Submitted 12 January, 2011;
originally announced January 2011.