Pulse · tensorflow/tensorflow · GitHub

July 16, 2025 – July 23, 2025

Overview

327 Active pull requests

15 Active issues

220 Pull requests merged by 6 people

[xla:cpu] Make dot thunk support int8 x int8 -> int32 matmul.
#97435 merged Jul 23, 2025
Avoid registering definition events until after the allocation succeeds
#97418 merged Jul 23, 2025
Add Hermetic C++ Toolchains for Linux x86_64 builds.
#96820 merged Jul 23, 2025
Unbreak HloPassPipeline silent/noop change debugging.
#97431 merged Jul 23, 2025
[xla] Optimize ShapeTree construction time for arrays
#97394 merged Jul 23, 2025
Correct the required gpu to test_core_h100
#97352 merged Jul 23, 2025
Use mdformat on the XNNPack delegate readme.
#97182 merged Jul 23, 2025
This is an automatic update to a device compatibility allowlist.
#97423 merged Jul 23, 2025
[tfcompile] Prefix xla generated symbols with tfcompile_xla_generated.
#97427 merged Jul 23, 2025
[XLA] Add stack trace breakdown to HloLiveRange::ToString for peak memory usage
#94954 merged Jul 23, 2025
[XLA:GPU] Do not fail when profiling executables, instead ignore failing configurations.
#97344 merged Jul 23, 2025
Reverts e4e87da4c3ed8a529f67bb1e3485101c37a4e1f9
#97424 merged Jul 23, 2025
Guard against possible null pointer dereferences.
#97426 merged Jul 23, 2025
[XLA:Autotuner] Allow gpu_profiler to collect output buffers.
#97355 merged Jul 23, 2025
Automated Code Change
#97401 merged Jul 23, 2025
[XLA:GPU][host offloading] Implement host offloading thunks.
#97059 merged Jul 23, 2025
PR #28735: [XLA:GPU] Enabling cuda graph concurrent mode by default
#97045 merged Jul 23, 2025
[XLA:GPU] Early return if there are no elements to compare with.
#97402 merged Jul 23, 2025
[XLA:CPU] Run CSE after inlining in fusion compiler.
#97114 merged Jul 23, 2025
Reverts 1b7ced5a66c8948c8b6dfa984ddea56e99c3cba3
#97404 merged Jul 23, 2025
[triton] Avoid propagating slice layouts across broadcast.
#97350 merged Jul 23, 2025
Prevent XLA from crashing when a Literal is too big to fit in memory.
#97400 merged Jul 23, 2025
Refactor shardy_xla_pass_test with IsCustomCall. No behavior change.
#97388 merged Jul 23, 2025
[XLA][Numerics][HLO Value Tracking] Support original values during HLO and stableHLO round trip
#97148 merged Jul 23, 2025
Automated Code Change
#97342 merged Jul 23, 2025
[xla] ShapeUtil: optimize shape traversal
#97381 merged Jul 23, 2025
[Hlo Diff] Use full constant values when generating fingerprints and simplify the ExactSubgraphMatcher.
#96312 merged Jul 23, 2025
Fix a bug where doing a CPU jax.Array -> TPU jax.Array transfer was not setting the layout correctly on the output TPU array.
#97393 merged Jul 23, 2025
Change PjRtClient::LazyToLiteral to take a generator that returns a future of the literal
#97162 merged Jul 23, 2025
[HLO Diff] Fix bug in instruction users matching to use fingerprints instead of pointers.
#97292 merged Jul 23, 2025
Migrate away from ArrayRef(std::nullopt_t)
#97382 merged Jul 23, 2025
[tf] Migrate from std::map to absl::flat_hash_map
#97376 merged Jul 23, 2025
[IFRT] Define user_context() in Value and LoadedExecutable
#97153 merged Jul 23, 2025
[HLO Diff] Fix the order of operands in ?: operator in TopDownMatcher.
#97291 merged Jul 23, 2025
[HLO Diff] Fix existence check in TopDownMatcher.
#97290 merged Jul 23, 2025
XLA: Fix method ambiguity
#97359 merged Jul 23, 2025
[xla] ShapeUtil: avoid WithStatus callbacks on a hot path
#97380 merged Jul 23, 2025
Fix resource calculation for annotation groups that do not contain full pairs of start-done ops.
#97366 merged Jul 23, 2025
WithIndex: an iterator wrapper that adds iteration index to yielded value.
#97370 merged Jul 23, 2025
[IFRT] Remove the error message mentioning an invalid output target
#97390 merged Jul 23, 2025
[xla:tsl] Add Eigen contraction kernel template specialization for int8 LHS x int8 RHS -> int32 result.
#97383 merged Jul 23, 2025
Refactor StreamExecutorGpuTopologyDescription::DeviceDescriptions to share SetupDeviceDescription
#97378 merged Jul 23, 2025
Increase max stack depth of GPU compile callstack.
#97384 merged Jul 22, 2025
Update zlib/zstd patch after
#97369 merged Jul 22, 2025
Reverts c197694417be40d121b8b8ba9bb50efd81c4f227
#97375 merged Jul 22, 2025
#sdy update hasGspmdAttrsOrOps to look at the body of the main function.
#97358 merged Jul 22, 2025
[tf] Do not keep XlaComputation alive after it's compiled to Xla executable
#97364 merged Jul 22, 2025
Integrate LLVM at llvm/llvm-project@0a343098b0ea
#97327 merged Jul 22, 2025
[xla:cpu] Do not serialize HloProto if user didn't ask for it
#97363 merged Jul 22, 2025
Add reshard cache when the sharding of reshape is tile maximal (replicated or maximal).
#97317 merged Jul 22, 2025
Removed optimized batch matmul implementation from tflite in favor of XNNPACK. Also performed tiny refactoring along the way.
#97184 merged Jul 22, 2025
Internal change only.
#97195 merged Jul 22, 2025
[XLA:GPU] enable emitting dots through the generic emitter
#96720 merged Jul 22, 2025
Enable N-dimensional sparse tensor in tpu_embedding_v3.py
#97151 merged Jul 22, 2025
PR #29115: Bump github/codeql-action from 3.29.2 to 3.29.3
#97337 merged Jul 22, 2025
[XLA][host offloading] Open source annotate host compute offload.
#97362 merged Jul 22, 2025
[XLA:GPU] Replace "agent" memory sync scope with "device" for nVidia backend.
#97360 merged Jul 22, 2025
[XLA:CPU][XLA:GPU] Move concat fusion emitter to shared directory
#97050 merged Jul 22, 2025
Automated Code Change
#97339 merged Jul 22, 2025
This is an automatic update to a device compatibility allowlist.
#97346 merged Jul 22, 2025
[XLA:GPU] Remove horizontal fusion.
#96241 merged Jul 22, 2025
Automated Code Change
#97247 merged Jul 22, 2025
[XLA:GPU] Choose a tile size according to hardware constraints.
#97335 merged Jul 22, 2025
Use CONCURRENT_KERNEL tracing in KernelNameTracer
#97280 merged Jul 22, 2025
Fix bf16 propagation for host memory space
#97310 merged Jul 22, 2025
Avoid crash in RequestActivityBuffer
#97279 merged Jul 22, 2025
Automated Code Change
#97241 merged Jul 22, 2025
Add TPU v4i to GetDeviceMemoryInBytes()
#97316 merged Jul 22, 2025
[xla] ShapedBuffer: Don't compute on_host_shape if it's the same as device shape
#97323 merged Jul 22, 2025
Automated Code Change
#97248 merged Jul 22, 2025
Automated Code Change
#97025 merged Jul 22, 2025
Fixes minor bug with low memory limits in Peak Priority remat allowing them to be zero.
#97298 merged Jul 22, 2025
[xla:cpu] Refactor a few helper functions in DotLibraryRewriter for readability.
#97324 merged Jul 22, 2025
Remove AbstractCpuBuffer. All subclasses can be replaced with CommonPjRtBufferImpl and removed.
#97194 merged Jul 22, 2025
[xla:cpu] Add useful TraceMe annotations to CpuExecutable
#97318 merged Jul 22, 2025
[xla:cpu] Pass --xla_cpu_experimental_onednn_fusion_types to OneDnnMatcher.
#97320 merged Jul 22, 2025
[xla] Annotate likely branches with ABSL_PREDICT_TRUE
#97311 merged Jul 22, 2025
Add missing includes.
#96385 merged Jul 22, 2025
Add a warning log in xla::ShardingPropagation that it will be deprecated and replaced with Shardy.
#97299 merged Jul 22, 2025
Introduce AllGatherDynamicSliceShuffledOffsetSimplifier, a new HLO pass that collapses dynamic-slice(all-gather) with shuffled offsets into collective-permute.
#97287 merged Jul 22, 2025
Updating necessary visibilities ahead of sa-train-google in-place migrations.
#97304 merged Jul 22, 2025
Introduce PickUnusedPort() that doesn't terminate when failing to find a port. The new function returns -1 when failing to find an unused port. The original PickUnusedPortOrDie() still terminates when a port can't be found.
#97308 merged Jul 21, 2025
[xla] Make Shape::Equal faster
#97296 merged Jul 21, 2025
lite: Add config option to enable benchmark_model
#96999 merged Jul 21, 2025
Add a RequiresQuantizedBiasInterface
#96580 merged Jul 21, 2025
Remove unused API
#97303 merged Jul 21, 2025
Update BUILD files with dependencies
#97305 merged Jul 21, 2025
Optimize BatchMatmul to Fully Connected when RHS is reshaped after dequantization.
#97205 merged Jul 21, 2025
Add SparseCore documentation
#97200 merged Jul 21, 2025
Pin python dependencies and update lock files
#97302 merged Jul 21, 2025
Update version numbers for TensorFlow 2.20.0-rc0
#97295 merged Jul 21, 2025
Use stablehlo precision config conversion for stablehlo ops
#97233 merged Jul 21, 2025
Replace implementation of TSL status matchers with Abseil.
#97283 merged Jul 21, 2025
[xla:cpu] Make se::HostStream fully synchronous
#97288 merged Jul 21, 2025
Skip flaky test on Windows.
#97286 merged Jul 21, 2025
Added WatchJobState RPC to coordination service.
#97077 merged Jul 21, 2025
Add individual test targets for tensorflow/core:legacy_lib_core_tests.
#97285 merged Jul 21, 2025
Updating XNNPACK readme to include the correct weight cache file path field.
#97289 merged Jul 21, 2025
Add 10 Maxtext-derived HLO-based benchmarks
#97132 merged Jul 21, 2025
[xla] Add benchmarks for Shape::Equal
#97293 merged Jul 21, 2025
Avoid heap allocation for the sub buffer address
#97230 merged Jul 21, 2025
Avoid checking captured_tensors' usage when deciding if
#97141 merged Jul 21, 2025
LatencyHidingScheduler: Only recalculate when we've touched an already-scheduled computation, and use computation-specific peak rather than module peak in statistics.
#97192 merged Jul 21, 2025
Reverts 0c0a40e9e8dd2f04ab9a66643ca628acdef5840d
#97281 merged Jul 21, 2025
#sdy delete round_trip_import dir since JAX export can now export a model with StableHLO + Shardy dialects.
#96423 merged Jul 21, 2025
#sdy use new API to pass dump-index and use MeshAttr::getMaximal
#97275 merged Jul 21, 2025
Avoid recomputation of pjrt_buffer->memory_space() in MakeMemoryKindFromPjRtBuffer.
#97068 merged Jul 21, 2025
[XLA:GPU] Use correct unroll factor in indexing map
#97276 merged Jul 21, 2025
Add missing dependencies.
#97261 merged Jul 21, 2025
[XLA] NFC: remove duplication in diag message
#97274 merged Jul 21, 2025
[XLA:GPU] Add Get method to GpuPerformanceModelOwning.
#97277 merged Jul 21, 2025
Reverts b96ee234c76e3f76341d0acec46e2e7d067cd5a5
#97273 merged Jul 21, 2025
[XLA:GPU][host offloading] Implement gpu host offloading allocator.
#97051 merged Jul 21, 2025
[XLA:GPU] Refactor tests of IndexingMap
#97138 merged Jul 21, 2025
Fix cost analysis for output bytes accessed when the result is a tuple
#96927 merged Jul 21, 2025
Remove LLVM dependency from KernelThunk
#97178 merged Jul 21, 2025
Fix a problem with a variable being used out of scope.
#97272 merged Jul 21, 2025
[XLA:GPU][Tiling] Use SmallVector<OneDimTile> to store tiling info.
#97229 merged Jul 21, 2025
Automated Code Change
#97268 merged Jul 21, 2025
Give better error in run_hlo_module if HLO has collectives.
#97208 merged Jul 21, 2025
Introduces a new utility function, MatchPermutedSliceAndPartitionOffset, to detect a pattern where a DynamicSlice consumes the output of an AllGather with a permuted set of offsets. This pattern is equivalent to a CollectivePermute and can be optimized accordingly.
#97189 merged Jul 20, 2025
Automated Code Change
#97259 merged Jul 20, 2025
Add sdy shardings in frontend_attributes alongside hlo shardings for extra wrapper main added in tf2xla bridge.
#96919 merged Jul 19, 2025
[XLA:GPU] Support control flow thunks in command buffer conversion pass. We only convert kWhile and kConditional thunks if all thunks in all branches are convertible.
#97183 merged Jul 19, 2025
#sdy Fix forward of making XLA C++ changes so we can fall back to GSPMD in JAX export if the loaded module was lowered for GSPMD.
#96368 merged Jul 19, 2025
[XLA] Add helper function GetIndicesSpecForDynamicSlice to get indices spec for dynamic slice fed by all-gather, the spec includes the mapping from slice offsets to corresponding partition IDs(flattened-id).
#96805 merged Jul 19, 2025
Internal, visibility only changes to public code.
#97207 merged Jul 19, 2025
Add visibility to hlo_input_output_format
#96758 merged Jul 19, 2025
Use a literal sentinel value for kernel init failure
#97193 merged Jul 19, 2025
Reduce redundancy between StringTo* enum functions.
#97201 merged Jul 19, 2025
[XLA:CPU] Refactor Intrinsic and use it in all math intrinsics.
#97000 merged Jul 19, 2025
Integrate LLVM at llvm/llvm-project@06ae0c2a1086
#97154 merged Jul 19, 2025
Update nccl_archive BUILD file to fix TF GPU wheel build.
#97206 merged Jul 19, 2025
[XLA:GPU] Add a verifier to the GPU compiler before post-scheduling pipeline.
#97150 merged Jul 19, 2025
Use host callback in the CopyToHostFuture method in Async PjRt.
#97203 merged Jul 18, 2025
Add function ExtractDynamicSliceFromCollectiveUser to extract a dynamic slice user from a collective.
#96802 merged Jul 18, 2025
no external change
#97100 merged Jul 18, 2025
Reverts 849435a30d0487e415126507953575358ed3c4eb
#97190 merged Jul 18, 2025
Reverts 2a45c5b0c326e20eafe833df055326b39edadcf2
#97071 merged Jul 18, 2025
Bump sqlite to 3.50.3
#97191 merged Jul 18, 2025
Typo fix "perferred" -> "preferred".
#97198 merged Jul 18, 2025
PR #28257: [XLA:GPU] Update ONEAPI crosstool compiler wrapper
#97149 merged Jul 18, 2025
Use ASSERT_THAT to check pass.Run() result
#97164 merged Jul 18, 2025
Update the XNNPack delegate README.
#97181 merged Jul 18, 2025
Annotate some XLA:GPU flags as stable i.e. they should provide 6 month deprecation notice.
#97134 merged Jul 18, 2025
[XLA:GPU] Add a test for DotForInt4vsIdentityBF16ReturnsCorrectResult.
#97064 merged Jul 18, 2025
PR #28985: [XLA:GPU] Add shared_memory_per_block_optin device info member
#97140 merged Jul 18, 2025
Update README.md
#96902 merged Jul 18, 2025
Update dependencies to XNNPACK.
#97177 merged Jul 18, 2025
[XLA:GPU] Move Dot strength reduction out of algebraic simplifier
#97166 merged Jul 18, 2025
[XLA:GPU] Remove CHECK-CSE since it is not used.
#97129 merged Jul 18, 2025
#sdy improve the error messaging when importing and exporting sharding custom calls.
#97041 merged Jul 18, 2025
Introduce new helper function that produces device lists for iota tile assignment. Apply it in xla_sharding_util.cc.
#97176 merged Jul 18, 2025
Introduce stable flags and associated deprecation policy for XLA debug options.
#97049 merged Jul 18, 2025
Use GetInPlaceInputOutputPairs from AliasInfo instead of HloDataflowAnalysis.
#97170 merged Jul 18, 2025
Remove ifdef from ir_emitter_unnested and fix various clang-tidy warnings
#97127 merged Jul 18, 2025
Add TmaMetadata serialization support
#97103 merged Jul 18, 2025
Automated Code Change
#97109 merged Jul 18, 2025
Move GetInPlaceInputOutputPairs and related code to AliasInfo class (NFC).
#97119 merged Jul 18, 2025
Automated Code Change
#97123 merged Jul 18, 2025
Fix tests paths and visibility issue for tflite/converter
#97147 merged Jul 18, 2025
Remove leftover logging
#97145 merged Jul 18, 2025
Automated Code Change
#97033 merged Jul 18, 2025
Propagate context to the waiter destruction sequence, so that all contained operations execute with the correct context.
#97143 merged Jul 18, 2025
Update PjRtCpuExecutable to not rely on any internals of PjRtCpuBuffer.
#97146 merged Jul 18, 2025
Handle V2 xla::OpSharding in ExtractInputsForLogicalDevices and ParseAndValidateOutputSharding.
#97136 merged Jul 18, 2025
Exclude tensorflow/lite/mlir/lite protos definitions when compiling under LiteRT repo and enable LiteRT disbale_tf_lite_py by default
#97137 merged Jul 18, 2025
Update version to 2.21.0
#97079 merged Jul 18, 2025
[XLA:TPU] In MSA, when removing instructions, we need to remove their scoped allocations from PresetAssignments.
#96945 merged Jul 17, 2025
Modified python bindings to enable passing a probe_instrumentation_dir to support interpreter ops in eval_module. Consistent with StableHLO interpreter usage from command line
#97091 merged Jul 17, 2025
[XLA][host offloading] Return AsyncValue from HostOffloadingExecutable.
#96915 merged Jul 17, 2025
#sdy update dump names and add index as prefix so they would be clearer for users
#97117 merged Jul 17, 2025
[Autotuner] Add block level emitter backend for Triton fusion (3).
#96798 merged Jul 17, 2025
[IFRT] Add UserContextScope
#97012 merged Jul 17, 2025
Add ReleaseDeviceMemoryOwnership implementation based on
#97144 merged Jul 17, 2025
Migrate uses of XLA_TEST_BACKEND macros to use utilities in xla_test_backend_predicates.h
#97135 merged Jul 17, 2025
Correctly identify async start and done ops in latency hiding scheduler.
#97089 merged Jul 17, 2025
Close output shardings to respect allow_spmd_sharding_propagation_to_output flag set to default {false} value. Added multiple test variants to test shardy, use_compile_options_from_model.
#97126 merged Jul 17, 2025
[xla:cpu] Make DotLibraryRewriter support greedy fusion mode.
#96319 merged Jul 17, 2025
Internal change only
#97065 merged Jul 17, 2025
Optimize BM_GlobalDecreasingSizeBestFitHeap benchmark by up to 3%.
#97075 merged Jul 17, 2025
Update release notes for TensorFlow 2.20.0
#97080 merged Jul 17, 2025
Relax the folding size threshold to 200 MiB.
#97078 merged Jul 17, 2025
Update CommonPjRtBufferImpl to have specialized versions for both cpu->device
#97085 merged Jul 17, 2025
#sdy define the utils that JAX jaxlib will use to allow for falling back to GSPMD when loading an old checkpoint.
#97130 merged Jul 17, 2025
[Autotuner] Add block level emitter backend for Triton fusion (2).
#96796 merged Jul 17, 2025
Use ASSERT_THAT(..., IsOkAndHolds(true)) for consistency and correctness
#97005 merged Jul 17, 2025
fix(dtensor): guard against nullptr from TF_TensorData in ExtractSmallTensorValue
#96866 merged Jul 17, 2025
Reverts 812bb86d50b1cee5cf32ccb1629a49687e924ea5
#97098 merged Jul 17, 2025
Simplify ShouldSkipForSideEffect function in zero_sized_hlo_elimination.
#97101 merged Jul 17, 2025
[XLA:GPU] Remove unused DotSparsityRewriter.
#97128 merged Jul 17, 2025
Automated Code Change
#97122 merged Jul 17, 2025
[XLA:GPU] additional logging in triton fusion numeric verifier
#97056 merged Jul 17, 2025
[xla:gpu][triton] triton-xla-squeeze-dims pass improvements.
#97099 merged Jul 17, 2025
Automated Code Change
#96959 merged Jul 17, 2025
PR #28073: [XLA:GPU][oneAPI] Enable Level_zero support
#97022 merged Jul 17, 2025
Remove deprecated HloAliasAnalysis::Run method
#97044 merged Jul 17, 2025
Add serialization and deserialization for the cuDNN thunk
#96914 merged Jul 17, 2025
no external change
#96942 merged Jul 17, 2025
[xla] Optimize ShapeUtil::ForEach traversals
#97063 merged Jul 17, 2025
Support INT16 for PRelu op
#96899 merged Jul 17, 2025
[xla:tf] Check if device shape is already a host shape
#97018 merged Jul 17, 2025
Add int16 kernel support for DIV op
#96934 merged Jul 17, 2025
Rollback https://github.com/openxla/xla/commit/cf3dfa9723c4cd4e2b25a606207a201a95fe71db
#97074 merged Jul 17, 2025
Fix //tflite/converter/tests/... MLIR tests by fixing .bzl rules and redirecting tensorflow submodule
#97003 merged Jul 16, 2025
Update release notes at HEAD
#97073 merged Jul 16, 2025
Enable --flaky_test_attempts in release branch
#97076 merged Jul 16, 2025
Move op name longest prefix logic from annotation.cc to somewhere upper level
#93906 merged Jul 16, 2025
Internal change only
#96928 merged Jul 16, 2025
Refactor optimized div for int8 and uint8
#96933 merged Jul 16, 2025
Add Hermetic C++ Toolchains for Linux x86_64 builds.
#96803 merged Jul 16, 2025
Migrate uses of XLA_TEST_BACKEND macros to use utilities in xla_test_backend_predicates.h
#97006 merged Jul 16, 2025
[JAX]: rollforward. Add ability to add a transfer server factory to override
#97069 merged Jul 16, 2025
Update dependencies to XNNPACK and cpuinfo.
#96990 merged Jul 16, 2025
Complete the CommonPjRtBufferImpl implementation.
#97001 merged Jul 16, 2025
[xla] Move xla::Shape functions that are used on a hot path to header file
#97057 merged Jul 16, 2025
Increase the size of __tensorflow_core_lib_core_legacy_lib_core_all_tests to deflake CI.
#97061 merged Jul 16, 2025
Support composite unpack and pack legalization with dynamic shape
#97062 merged Jul 16, 2025
Reverts e52a31e166af020e465c7494a6353f098a65155c
#97066 merged Jul 16, 2025
Rollback for missing header
#97067 merged Jul 16, 2025
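
The entries for #97189 and #97287 above describe a DynamicSlice that consumes an AllGather with permuted per-partition offsets, and state that this pattern is equivalent to a CollectivePermute. The following is a minimal NumPy simulation of that equivalence, not XLA code: the helper names, the `source_of` permutation, and the shard sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np

def all_gather(shards):
    """Simulated all-gather: every partition receives all shards concatenated."""
    gathered = np.concatenate(shards)
    return [gathered.copy() for _ in shards]

def dynamic_slice_per_partition(gathered_per_partition, source_of, shard_size):
    """Partition p slices out the shard that originally lived on source_of[p]."""
    return [
        gathered[source_of[p] * shard_size:(source_of[p] + 1) * shard_size]
        for p, gathered in enumerate(gathered_per_partition)
    ]

def collective_permute(shards, source_target_pairs):
    """Simulated collective-permute: each source sends its shard to one target."""
    out = [np.zeros_like(s) for s in shards]
    for source, target in source_target_pairs:
        out[target] = shards[source].copy()
    return out

shard_size = 2
shards = [np.arange(p * shard_size, (p + 1) * shard_size) for p in range(4)]
source_of = [2, 0, 3, 1]  # hypothetical permuted offsets, one per partition

via_slice = dynamic_slice_per_partition(all_gather(shards), source_of, shard_size)
via_permute = collective_permute(
    shards, [(src, p) for p, src in enumerate(source_of)]
)
same = all(np.array_equal(x, y) for x, y in zip(via_slice, via_permute))
```

The equivalence relies on `source_of` being a permutation, so that each source partition feeds exactly one target. The permute form moves one shard per partition instead of gathering all of them, which is presumably the motivation for the rewrite.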

107 Pull requests opened by 4 people

No changes to 3rd party.
#97070 opened Jul 16, 2025
Remove `local_config_nvshmem` repository from XLA and Tensorflow WORKSPACE files.
#97082 opened Jul 17, 2025
Cache device on `PJRT_Buffer`.
#97083 opened Jul 17, 2025
[XLA:MSA] Add block allocations for program weights that are not aliased and single use.
#97084 opened Jul 17, 2025
PR #28883: [XLA:CPU][oneDNN] Add build flag to enable asynchronous support in oneDNN
#97115 opened Jul 17, 2025
Introduce --dump_tflite_model_dir to dump TFLite models in Delegate Test Suite (DTS)
#97118 opened Jul 17, 2025
Integrate LLVM at llvm/llvm-project@06ae0c2a1086
#97139 opened Jul 17, 2025
Add Metal LiteRt Tensor Buffer support
#97142 opened Jul 17, 2025
Remove redundant string conversion.
#97152 opened Jul 17, 2025
Optimize `HasCombinableReplicaGroup` and `xla::CheckReplicaGroups`.
#97155 opened Jul 18, 2025
[IFRT] Support XLA GPU flag overrides.
#97156 opened Jul 18, 2025
[XLA:MSA] Reduce available memory bandwidth for instruction that are overlapped with bandwidth limiting asynchronous instructions.
#97163 opened Jul 18, 2025
Remove HLO and Autotuner dependency from CublasLtMatmulThunk
#97180 opened Jul 18, 2025
[XLA] Use sort instead of btree in MakeFreeChunks.
#97185 opened Jul 18, 2025
Automated Code Change
#97188 opened Jul 18, 2025
IFRT proxy logging fix: Do not log error when Executable is destroyed before its metadata is queried by the server (and sent over to the client).
#97197 opened Jul 18, 2025
Handle negative permutations in IsTransposeTrivial.
#97199 opened Jul 18, 2025
Determine collective support based on #partitions
#97211 opened Jul 19, 2025
[xla:gpu][triton] In squeeze-dims pass, keep at least two dimensions.
#97223 opened Jul 19, 2025
Add a scratch implementation of muon
#97231 opened Jul 19, 2025
Automated Code Change
#97236 opened Jul 19, 2025
Automated Code Change
#97237 opened Jul 19, 2025
Automated Code Change
#97238 opened Jul 19, 2025
Add HloInstruction extractor for custom-call kernel metadata.
#97239 opened Jul 19, 2025
Automated Code Change
#97240 opened Jul 19, 2025
Automated Code Change
#97249 opened Jul 20, 2025
Automated Code Change
#97254 opened Jul 20, 2025
Fix stride overflow, and solution to issue number #97165
#97260 opened Jul 20, 2025
Have HloModule::Clone copy layout_canonicalization_callback.
#97262 opened Jul 20, 2025
Automated Code Change
#97265 opened Jul 21, 2025
Automated Code Change
#97267 opened Jul 21, 2025
Integrate LLVM at llvm/llvm-project@13f7786f72d1
#97271 opened Jul 21, 2025
Add an xla flag to lower ragged dots through the ragged dot fusion emitter
#97278 opened Jul 21, 2025
Integrate Triton up to [6af1c4b5](https://github.com/openai/triton/commits/6af1c4b507dfddd0d62cdc8613839f46bdf6acb0)
#97282 opened Jul 21, 2025
Integrate LLVM at llvm/llvm-project@13f7786f72d1
#97284 opened Jul 21, 2025
[HLO Diff] Use full constant values when generating fingerprints.
#97294 opened Jul 21, 2025
Update release notes for TensorFlow 2.19.1
#97297 opened Jul 21, 2025
Update `rules_ml_toolchain` dependency version.
#97300 opened Jul 21, 2025
[XLA:GPU] Remove horizontal fusion.
#97301 opened Jul 21, 2025
Migrate uses of `XLA_TEST_BACKEND` macros to use utilities in `xla_test_backend_predicates.h`
#97306 opened Jul 21, 2025
PR #28919: [NVIDIA GPU] [XLA_GPU_MS_COLLECTIVE] Round-robin stream assignment for async communications
#97307 opened Jul 21, 2025
Adding Metal delegate kernel
#97312 opened Jul 21, 2025
Integrate LLVM at llvm/llvm-project@c384ec431dd7
#97314 opened Jul 21, 2025
Add gemm_config for xnnpack gemms.
#97315 opened Jul 22, 2025
Fixed dependency cleaner support for `tf_kernel_library` so that it will update both `deps` and `gpu_deps`.
#97319 opened Jul 22, 2025
[SDY] dump source sharding info when output shardyDir is non-empty. This occurs when `xla_dump_to` flag is set and `xla_dump_hlo_pass_re` regex matches the shardy pass (see `getShardyDirIfShouldDump`).
#97321 opened Jul 22, 2025
Integrate LLVM at llvm/llvm-project@c384ec431dd7
#97322 opened Jul 22, 2025
Automated Code Change
#97326 opened Jul 22, 2025
Automated Code Change
#97330 opened Jul 22, 2025
Enable dynamic-slice offset based on multiply for `AllGatherPermutedDsSimplifierTest`, more specifically,
#97332 opened Jul 22, 2025
PR #27904: [XLA:GPU][oneAPI] Enable Clang compiler as the host compiler
#97333 opened Jul 22, 2025
Add utility function `MatchDsPadAllGather` to match the pattern of `DynamicSlice(Pad(AllGather))`.
#97336 opened Jul 22, 2025
[XLA:GPU] Migrate GemmAlgorithmPicker to new autotuner.
#97340 opened Jul 22, 2025
PR #28540: [XLA:GPU] Add support for pre-padded scales for block scaled dot custom call
#97347 opened Jul 22, 2025
Add `_XlaShardingV2` in `TPUPartitionedOps` and use if present for all op's as opposed to just `XlaSharding` op.
#97348 opened Jul 22, 2025
Improvements to KernelNameTracer
#97351 opened Jul 22, 2025
[XLA][host offloading] Correctly handle queueing tasks on stream.
#97353 opened Jul 22, 2025
[XLA:GPU] Enable strength reduction for s32xs32->s32 dots
#97356 opened Jul 22, 2025
[XLA:MGPU][Experimental] HLO -> Pallas.
#97357 opened Jul 22, 2025
Internal change only.
#97365 opened Jul 22, 2025
[ #HLODiff ] Fix a bug in hlo_diff_summary.cc
#97367 opened Jul 22, 2025
Turn DCHECK into runtime error. Because literal is user provided, it should
#97368 opened Jul 22, 2025
[XLA:CPU] Lower to xla.rsqrt for all rsqrt ops.
#97371 opened Jul 22, 2025
Add float data generation within given range to test utils
#97372 opened Jul 22, 2025
Add int8/int16 support for SQRT op to AEQ
#97373 opened Jul 22, 2025
[XLA:GPU] verify no collective deadlocks regardless if module has schedule
#97374 opened Jul 22, 2025
avoid removing some copy kernels.
#97377 opened Jul 22, 2025
[IFRT IR] Reduce log severity when dot_graph_dump_to=sponge outside of test
#97379 opened Jul 22, 2025
changes to the build process for litert
#97385 opened Jul 23, 2025
Add the skeleton for PropagateQsvPass in the tflite converter.
#97386 opened Jul 23, 2025
[xla] Change Shape::IsArray and Shape::IsArrayOrBuffer.
#97389 opened Jul 23, 2025
Automated Code Change
#97391 opened Jul 23, 2025
Parse and ignore 'mode' attribute on collectives.
#97392 opened Jul 23, 2025
Integrate LLVM at llvm/llvm-project@13f7786f72d1
#97395 opened Jul 23, 2025
Add an optimization pattern that transforms const<[a, 1]> @ <[1, b]> to <[1, b]> * const<[a, 1]>.
#97396 opened Jul 23, 2025
PR #29204: [XLA:GPU] Remove Stream Id from command buffer
#97397 opened Jul 23, 2025
Integrate LLVM at llvm/llvm-project@22b083539051
#97398 opened Jul 23, 2025
Automated Code Change
#97399 opened Jul 23, 2025
Remove a redundant condition in `mlir_hlo_to_hlo.cc`.
#97405 opened Jul 23, 2025
PR #26187: Simplify HloComputation::IsFusionComputation semantics.
#97406 opened Jul 23, 2025
Replace lite/tools/versioning:versioning with compiler/mlir/lite/tools/versioning:versioning
#97407 opened Jul 23, 2025
Automated Code Change
#97408 opened Jul 23, 2025
Automated Code Change
#97409 opened Jul 23, 2025
Automated Code Change
#97410 opened Jul 23, 2025
Automated Code Change
#97411 opened Jul 23, 2025
Automated Code Change
#97412 opened Jul 23, 2025
Simplify `get-tuple-element(tuple(A))` as `copy(A)` in mhlo->hlo conversion.
#97413 opened Jul 23, 2025
Automated Code Change
#97414 opened Jul 23, 2025
Automated Code Change
#97415 opened Jul 23, 2025
Automated Code Change
#97416 opened Jul 23, 2025
Automated Code Change
#97417 opened Jul 23, 2025
PR #28883: [XLA:CPU][oneDNN] Add build flag to enable asynchronous support in oneDNN
#97419 opened Jul 23, 2025
Automated Code Change
#97420 opened Jul 23, 2025
Automated Code Change
#97421 opened Jul 23, 2025
[XLA:GPU] Support different unroll factors than just 2.
#97422 opened Jul 23, 2025
[XLA] Improve hlo_live_range_test to use HLO textual form
#97425 opened Jul 23, 2025
Expose GPU tracing knobs for 3P
#97428 opened Jul 23, 2025
Make CustomCallThunk own the associated HloComputation
#97429 opened Jul 23, 2025
[XLA:GPU] Enable deviceless AutotunerPass.
#97430 opened Jul 23, 2025
Use file mapping on Windows for the XNNPack weight cache.
#97433 opened Jul 23, 2025
[XLA:GPU] Adjust indexing cost model heuristic for register usage.
#97434 opened Jul 23, 2025
[xla] Change Shape::element_type and Shape::array_or_buffer_element_type.
#97436 opened Jul 23, 2025
Extend the Profiler Interface in Autotuner.
#97437 opened Jul 23, 2025
Add Multi output reduce test
#97438 opened Jul 23, 2025
RaggedDot in HloEvaluator now tolerates inputs with different precision, like conv.
#97439 opened Jul 23, 2025
[tf] Use template parameter to pass async callback
#97440 opened Jul 23, 2025
Remove redundant conversion to std::string
#97441 opened Jul 23, 2025
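
The entry for #97396 above rewrites a matrix product of an `[a, 1]` constant with a `[1, b]` operand into an elementwise multiply. A small NumPy check, offered as a sketch under the assumption that `*` follows standard broadcasting semantics, shows why the two forms agree; the variable names and sizes are made up for the example and do not come from the PR.

```python
import numpy as np

a, b = 3, 4
const = np.arange(1.0, a + 1).reshape(a, 1)  # stands in for const<[a, 1]>
x = np.arange(1.0, b + 1).reshape(1, b)      # stands in for the <[1, b]> operand

# The contraction dimension has size 1, so each output element is a single
# product const[i, 0] * x[0, j].
matmul_result = const @ x

# Broadcasting the same two operands elementwise yields the same [a, b] result.
broadcast_result = x * const
```

Because the contracted dimension has length 1, the matmul performs no summation, so the broadcast multiply computes the same `[a, b]` values.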

5 Issues closed by 3 people

tensorflow load failed
#97432 closed Jul 23, 2025
how to use libtensorflowlite_c.so C API and delegate gpu opencl correctly?
#95795 closed Jul 23, 2025
Error in loading Tensorflow in python
#97345 closed Jul 22, 2025
Inconsistent NotEqual broadcasting behavior between CPU and GPU (CPU fails silently, GPU raises error)
#97227 closed Jul 19, 2025
graph execution error bug with tfm.nlp.layers.MultiHeadRelativeAttention
#94599 closed Jul 18, 2025

10 Issues opened by 5 people

tf.keras.Model has no attribute `submodules`, but the documentation still uses it
#97403 opened Jul 23, 2025
TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0 CUDA_ERROR_INVALID_HANDLE
#97387 opened Jul 23, 2025
Request for Python 3.13 support in TensorFlow
#97361 opened Jul 22, 2025
Inconsistent behavior for `tf.raw_ops.NotEqual` between CPU and GPU with non-broadcastable shapes
#97204 opened Jul 18, 2025
could you add support of the new optimizer: Muon
#97187 opened Jul 18, 2025
`tf.nn.depthwise_conv2d` crashes with large `strides` values when ONEDNN is enabled
#97165 opened Jul 18, 2025
`tf.pow` returns inconsistent value on CPU vs GPU
#97125 opened Jul 17, 2025
`tf.nn.local_response_normalization` returns incorrect output
#97105 opened Jul 17, 2025
`tf.linalg.matrix_rank` produces inconsistent output on CPU vs GPU with `tol=6`
#97102 opened Jul 17, 2025
`tf.math.argmax` throws `InvalidArgumentError` with valid `axis` of `int16` dtype
#97096 opened Jul 17, 2025

47 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Add a Reflection Map to `emitc` class
#96263 commented on Jul 23, 2025 • 11 new comments
TensorFlow DLL failed to load with newer version of TF
#91656 commented on Jul 17, 2025 • 0 new comments
Fix compile error in tensorflow/python/tfcompile_wrapper.cc on s390x
#87676 commented on Jul 21, 2025 • 0 new comments
Move duplicate CUDA/XLA registration logs from INFO to VLOG
#89808 commented on Jul 21, 2025 • 0 new comments
Fix comparison functions and add unit tests
#94484 commented on Jul 21, 2025 • 0 new comments
Fix reading mapped names in LayerNameMapper
#94651 commented on Jul 21, 2025 • 0 new comments
Update Protobuf to 6.31.1
#95873 commented on Jul 22, 2025 • 0 new comments
Remove LiteRT modules from TF python deps.
#95991 commented on Jul 22, 2025 • 0 new comments
Add support for deserializing xplanes to Jaxlib
#96282 commented on Jul 23, 2025 • 0 new comments
Add validation for label probability distribution in softmax_cross_entropy_with_logits
#96387 commented on Jul 21, 2025 • 0 new comments
Call `UpdateGlobalProcessInfo` from PjRt IFRT client.
#96494 commented on Jul 21, 2025 • 0 new comments
Fixes L2Pool implementation to not average pooling region squares
#96599 commented on Jul 23, 2025 • 0 new comments
PR #19067: [XLA:CPU][oneDNN] Move simplification pass before oneDNN pass
#96617 commented on Jul 19, 2025 • 0 new comments
#sdy Remove MHLO shardings from round-trip export
#96640 commented on Jul 17, 2025 • 0 new comments
Don't export Shardy in MPMD before going to IFRT, since IFRT can support (and export) a module with StableHLO + Shardy.
#96648 commented on Jul 23, 2025 • 0 new comments
Use RPG's solution as a hint to CP-SAT
#96674 commented on Jul 17, 2025 • 0 new comments
[TFLite] Support 16KB page sizes alignment for libtensorflowlite_gpu_gl.so
#96702 commented on Jul 21, 2025 • 0 new comments
Automated Code Change
#96929 commented on Jul 23, 2025 • 0 new comments
Avoid crashing when LRU cache keys change.
#96930 commented on Jul 18, 2025 • 0 new comments
Integrate LLVM at llvm/llvm-project@0d5325bb203f
#96998 commented on Jul 16, 2025 • 0 new comments
Update deps:
#97035 commented on Jul 23, 2025 • 0 new comments
[XLA:GPU] Move the s4 unpacking sequence from llvm pass to int4->int8 pass
#97047 commented on Jul 23, 2025 • 0 new comments
Allow the chaining of state across MetricHookInterface instantiations for multiple compilations.
#97054 commented on Jul 17, 2025 • 0 new comments
[ #HLODiff ] Add support for manual node matching.
#97060 commented on Jul 22, 2025 • 0 new comments
`tf.experimental.numpy.cumsum` handles overflow inconsistently on CPU and GPU
#97042 commented on Jul 17, 2025 • 0 new comments
Core Dump When Training
#97016 commented on Jul 17, 2025 • 0 new comments
YoloX different Model Output for Python and Android
#95489 commented on Jul 17, 2025 • 0 new comments
Tensorflow 2.19 fails to load after Pyside6
#97058 commented on Jul 17, 2025 • 0 new comments
Fail to build libtensorflow_framework.so.2.20.0
#96569 commented on Jul 18, 2025 • 0 new comments
`tf.nn.conv2d_transpose` crashes with "Illegal instruction (core dumped)"
#93733 commented on Jul 18, 2025 • 0 new comments
[Compatibility][Upgrade] TensorFlow 2.x to 2.15.0: Dependency Conflict and Version Downgrade Issue
#96694 commented on Jul 19, 2025 • 0 new comments
Build Error While Compiling TensorFlow Lite Using CMake
#96654 commented on Jul 19, 2025 • 0 new comments
Memory leak in tf.data when iterating over Dataset.from_generator
#65675 commented on Jul 19, 2025 • 0 new comments
TensorFlow on RTX 5090
#89272 commented on Jul 19, 2025 • 0 new comments
GPU Not Detected by TensorFlow Despite Proper System Setup
#96707 commented on Jul 22, 2025 • 0 new comments
Some sorting related ops produce results inconsistent with NumPy when tensor contains NaN
#95235 commented on Jul 22, 2025 • 0 new comments
tf.data.experimental.prefetch_to_device has no effect inside tf.distribute.Strategy.distribute_datasets_from_function.
#94735 commented on Jul 22, 2025 • 0 new comments
`tf.linalg.solve` behaves inconsistently on GPU for singular matrices depending on shape
#94657 commented on Jul 22, 2025 • 0 new comments
New library version does not support 16 KB pages on Android
#96602 commented on Jul 22, 2025 • 0 new comments
Convolution: CPU memory increase with growing number of different sequence lengths
#62441 commented on Jul 23, 2025 • 0 new comments
Remove or update zh-cn translation from installation instructions
#62245 commented on Jul 23, 2025 • 0 new comments
tf.strings.to_number cannot convert positive integers prefixed with "+" when out_type is tf.int32 or tf.int64
#62191 commented on Jul 23, 2025 • 0 new comments
failed to build branch r2.13
#60716 commented on Jul 23, 2025 • 0 new comments
Incorrect gradient in divide_no_nan and reciprocal_no_nan when divide by 0
#60715 commented on Jul 23, 2025 • 0 new comments
`tf.split` or `tf.transpose` cause errors for quantize-aware training with `quantize_apply`
#60714 commented on Jul 23, 2025 • 0 new comments
Dataset.ragged_batch does not produce correct specs with tf.py_function and tf.numpy_function
#60710 commented on Jul 23, 2025 • 0 new comments
Will TF support Triton in the future?
#96876 commented on Jul 23, 2025 • 0 new comments