{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":112542515,"defaultBranch":"main","name":"cutlass","ownerLogin":"NVIDIA","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2017-11-30T00:11:24.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/1728152?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1724962544.0","currentOid":""},"activityList":{"items":[{"before":"2991ce18d3121a10e622f5adcccd5859f83190cd","after":"44dae8b90ef232ea663727470dfbbe9daff6972d","ref":"refs/heads/main","pushedAt":"2024-09-19T15:40:30.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Adjust profiler space for SM89 (#1553)","shortMessageHtmlLink":"Adjust profiler space for SM89 (#1553)"}},{"before":"1ebda1ccef14df97da9ed098bd20e0c8520d6972","after":"2991ce18d3121a10e622f5adcccd5859f83190cd","ref":"refs/heads/main","pushedAt":"2024-09-18T14:37:24.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Add print_svg for mma (#1733)\n\n* add print_svg for mma\r\n\r\n* correct the code indentation","shortMessageHtmlLink":"Add print_svg for mma (#1733)"}},{"before":"9f68995de585a883e3ff6b1d0347ea02aff55451","after":"1ebda1ccef14df97da9ed098bd20e0c8520d6972","ref":"refs/heads/main","pushedAt":"2024-09-16T16:38:42.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Fix MMA promotion interval assertions (#1641)","shortMessageHtmlLink":"Fix MMA promotion interval assertions (#1641)"}},{"before":"3a8c01a18b24c35b216922481ac762496720a99d","after":"9f68995de585a883e3ff6b1d0347ea02aff55451","ref":"refs/heads/main","pushedAt":"2024-09-16T15:55:09.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"add publication: ‘EVT: Accelerating Deep Learning Training with Epilogue Visitor Tree’ (#1526)\n\nCo-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>","shortMessageHtmlLink":"add publication: ‘EVT: Accelerating Deep Learning Training with Epilo…"}},{"before":"dbdae514e03f83968f8b7dd4fb064071b9bfbdd1","after":"3a8c01a18b24c35b216922481ac762496720a99d","ref":"refs/heads/main","pushedAt":"2024-09-11T17:33:56.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Prefix a member template name with the template keyword. (#1796)\n\nFixes llvm buld error.","shortMessageHtmlLink":"Prefix a member template name with the template keyword. (#1796)"}},{"before":"21d0534167d71c806af7f88d70ba024cb85f34c3","after":"dbdae514e03f83968f8b7dd4fb064071b9bfbdd1","ref":"refs/heads/main","pushedAt":"2024-09-11T04:07:31.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Support for TMA Epilogue for Group Gemm and add pingpong ptr array & Group Gemm (#1795)","shortMessageHtmlLink":"Support for TMA Epilogue for Group Gemm and add pingpong ptr array & …"}},{"before":"323c8170bffdd11d774437b450e42d842e203517","after":"21d0534167d71c806af7f88d70ba024cb85f34c3","ref":"refs/heads/main","pushedAt":"2024-09-09T18:05:28.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"fix assertion (#1790)","shortMessageHtmlLink":"fix assertion (#1790)"}},{"before":"82f5075946e2569589439d500733b700a3141374","after":"323c8170bffdd11d774437b450e42d842e203517","ref":"refs/heads/main","pushedAt":"2024-09-06T03:25:03.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Support ComputeFn where output type differs from input type (#1771)\n\nThis is useful for e.g. function taking in 2 float inputs and turn them to complex","shortMessageHtmlLink":"Support ComputeFn where output type differs from input type (#1771)"}},{"before":"06e337758dd2c01b06930c543fcb0dfb7781cb93","after":"82f5075946e2569589439d500733b700a3141374","ref":"refs/heads/main","pushedAt":"2024-09-06T03:24:10.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"set_slice3x3 -> set_slice_3x3 (#1784)","shortMessageHtmlLink":"set_slice3x3 -> set_slice_3x3 (#1784)"}},{"before":"7369adcaca5b9db84ec04b6f52a8d1f8ef968e8d","after":"06e337758dd2c01b06930c543fcb0dfb7781cb93","ref":"refs/heads/main","pushedAt":"2024-09-05T21:14:15.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Remove extraneous comma in declaration (#1776)","shortMessageHtmlLink":"Remove extraneous comma in declaration (#1776)"}},{"before":"6c3044136b6462d0ff028ece1c1a83bb90a5b3aa","after":"7369adcaca5b9db84ec04b6f52a8d1f8ef968e8d","ref":"refs/heads/main","pushedAt":"2024-09-04T19:11:24.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Add Sm90LinCombPerColBias (#1774)\n\nCo-authored-by: Jiayu Sun ","shortMessageHtmlLink":"Add Sm90LinCombPerColBias (#1774)"}},{"before":"e1976daacc7b030ba672217eb5d96f5a663df4ab","after":"6c3044136b6462d0ff028ece1c1a83bb90a5b3aa","ref":"refs/heads/main","pushedAt":"2024-09-04T18:52:11.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Update barrier.h (#1782)","shortMessageHtmlLink":"Update barrier.h (#1782)"}},{"before":"f7b19de32c5d1f3cedfc735c2849f12b537522ee","after":"e1976daacc7b030ba672217eb5d96f5a663df4ab","ref":"refs/heads/main","pushedAt":"2024-08-30T03:11:06.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Add support for mixed 4-bit/8-bit data types GEMM (#1413)\n\n* Add support for mixed 4-bit/8-bit data types GEMM\r\n\r\n* fix ( and )\r\n\r\n---------\r\n\r\nCo-authored-by: Aleksandar Samardžić \r\nCo-authored-by: Haicheng Wu ","shortMessageHtmlLink":"Add support for mixed 4-bit/8-bit data types GEMM (#1413)"}},{"before":"4dbf5dbed2331b948b75a3dbeaf760d76b3b5964","after":"f7b19de32c5d1f3cedfc735c2849f12b537522ee","ref":"refs/heads/main","pushedAt":"2024-08-20T02:21:42.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"minor fix for a double quote in CMakeLists.txt (#1727)","shortMessageHtmlLink":"minor fix for a double quote in CMakeLists.txt (#1727)"}},{"before":"f93a69134ec8259fd235f220209d6f8734a5cb06","after":"4dbf5dbed2331b948b75a3dbeaf760d76b3b5964","ref":"refs/heads/main","pushedAt":"2024-08-19T17:26:09.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Use CUDA runtime API to retrieve function pointer to driver API (#1700)\n\n* Query pfn to driver api\n\n* use default for older toolkits\n\n---------\n\nCo-authored-by: shunfans ","shortMessageHtmlLink":"Use CUDA runtime API to retrieve function pointer to driver API (#1700)"}},{"before":"3f084f7f3c07d18066fb971823009aad9e00f77d","after":"f93a69134ec8259fd235f220209d6f8734a5cb06","ref":"refs/heads/main","pushedAt":"2024-08-16T12:15:00.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"d-k-b","name":"Dustyn Blasig","path":"/d-k-b","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15810718?s=80&v=4"},"commit":{"message":"Merge pull request #1714 from NVIDIA/u128_div\n\nfix uint128","shortMessageHtmlLink":"Merge pull request #1714 from NVIDIA/u128_div"}},{"before":"865be73a97bd9594092a2f7cf6719e0b3e5ab210","after":"3f084f7f3c07d18066fb971823009aad9e00f77d","ref":"refs/heads/main","pushedAt":"2024-08-16T04:59:29.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Add couple configs into generator.py for mixed input MM (#1350)\n\n* Add couple configs into generator.py for mixed input MM\r\n\r\n* change one unit test name; reenable 128x32 in the profiler\r\n\r\n* Added U8/BF16 tests.\r\n\r\n---------\r\n\r\nCo-authored-by: Haicheng Wu \r\nCo-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>","shortMessageHtmlLink":"Add couple configs into generator.py for mixed input MM (#1350)"}},{"before":null,"after":"b0296bf682375b56401c17901693218c54b11b8b","ref":"refs/heads/u128_div","pushedAt":"2024-08-16T04:06:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"fix uint128","shortMessageHtmlLink":"fix uint128"}},{"before":"8d8cfdf37560b6d799685daf80bee18c96114732","after":null,"ref":"refs/heads/351_sparse_update","pushedAt":"2024-08-15T16:45:21.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"}},{"before":"fb170439e86220903960269abd4dbcbe31a4ff6f","after":"865be73a97bd9594092a2f7cf6719e0b3e5ab210","ref":"refs/heads/main","pushedAt":"2024-08-15T16:44:49.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"d-k-b","name":"Dustyn Blasig","path":"/d-k-b","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15810718?s=80&v=4"},"commit":{"message":"Merge pull request #1713 from NVIDIA/351_sparse_update\n\nupdate 3.5.1 readme/changelog","shortMessageHtmlLink":"Merge pull request #1713 from NVIDIA/351_sparse_update"}},{"before":null,"after":"8d8cfdf37560b6d799685daf80bee18c96114732","ref":"refs/heads/351_sparse_update","pushedAt":"2024-08-15T04:14:09.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"update 3.5.1 readme/changelog","shortMessageHtmlLink":"update 3.5.1 readme/changelog"}},{"before":"4e5a8f6853817e6595189e712a8018e1b71e4380","after":"fb170439e86220903960269abd4dbcbe31a4ff6f","ref":"refs/heads/main","pushedAt":"2024-08-14T18:59:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Update half.h (#1709)","shortMessageHtmlLink":"Update half.h (#1709)"}},{"before":"7192f4ab230bb721fa8d4d3df33886dbe86cdc59","after":"4e5a8f6853817e6595189e712a8018e1b71e4380","ref":"refs/heads/main","pushedAt":"2024-08-12T22:55:55.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"3.5.1 plots and updated readme (#1708)\n\nCo-authored-by: dePaul Miller <23461061+depaulmillz@users.noreply.github.com>","shortMessageHtmlLink":"3.5.1 plots and updated readme (#1708)"}},{"before":"2049c6c5a22bcc5c081a7c172eb4978f44602cb3","after":"7192f4ab230bb721fa8d4d3df33886dbe86cdc59","ref":"refs/heads/main","pushedAt":"2024-08-08T18:00:24.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Add CLayout_64x208 (#1680)\n\nWithout this I get compilation error when the extended shapes are enabled","shortMessageHtmlLink":"Add CLayout_64x208 (#1680)"}},{"before":"e22ba590cd8a7eebea8f53c81b5740d905021654","after":"2049c6c5a22bcc5c081a7c172eb4978f44602cb3","ref":"refs/heads/main","pushedAt":"2024-08-08T17:56:23.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"5476 cutlass 3x gemm kernels (#1695)\n\nCo-authored-by: dePaul Miller <23461061+depaulmillz@users.noreply.github.com>","shortMessageHtmlLink":"5476 cutlass 3x gemm kernels (#1695)"}},{"before":"19b4c5e065e7e5bbc8082dfc7dbd792bdac850fc","after":"e22ba590cd8a7eebea8f53c81b5740d905021654","ref":"refs/heads/main","pushedAt":"2024-08-06T15:15:18.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"support data type w2 used in cutlass_library (#1517)","shortMessageHtmlLink":"support data type w2 used in cutlass_library (#1517)"}},{"before":"06b21349bcf6ddf6a1686a47a137ad1446579db9","after":"19b4c5e065e7e5bbc8082dfc7dbd792bdac850fc","ref":"refs/heads/main","pushedAt":"2024-08-05T18:28:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Fix isnan namespace qualification in cutlass/functional.h (#1679)\n\n* Fix unrelated MSVC build warnings\r\n\r\n* Fix use of isnan in functional.h\r\n\r\nCorrect namespace qualification of isnan in functional.h\r\nso that it invokes cutlass::isnan for half_t, instead of\r\nconverting half_t to float and invoking std::isnan (on host,\r\nor ::isnan on device).","shortMessageHtmlLink":"Fix isnan namespace qualification in cutlass/functional.h (#1679)"}},{"before":"eee0cab26c8eedea447eb3b58b3498eeba2294da","after":"06b21349bcf6ddf6a1686a47a137ad1446579db9","ref":"refs/heads/main","pushedAt":"2024-08-01T16:20:28.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"1x1x1 cluster launch (#1673)","shortMessageHtmlLink":"1x1x1 cluster launch (#1673)"}},{"before":"36cbfcf483cc9d2ee65a55c199176ce96da1e33e","after":"eee0cab26c8eedea447eb3b58b3498eeba2294da","ref":"refs/heads/main","pushedAt":"2024-08-01T00:22:29.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Stamp out 1x1x1 clusters, 128x256 CTA shape (#1665)\n\nAdds 128x256 tile shapes to FP16/BF16 and FP8 generators.\nAlso adds 1x1x1 clusters to all existing FP16/BF16/FP8 generators.\n\nNOTE: it is important to set kernel filter (--kernels /\nCUTLASS_LIBRARY_KERNELS) to a non empty string and skip pruning to get\nall of the new configurations.\n\nIf profiling exhaustively, they can be set to `*`.\n\nNumber of CUTLASS 3.X GEMMs before this commit: 2868\nNumber of CUTLASS 3.X GEMMs after this commit: 4016\n\nCo-authored-by: Ali Hassani ","shortMessageHtmlLink":"Stamp out 1x1x1 clusters, 128x256 CTA shape (#1665)"}},{"before":"1f2b590da6dc7753ea24c5c35ab9bd2f4aa9255c","after":"36cbfcf483cc9d2ee65a55c199176ce96da1e33e","ref":"refs/heads/main","pushedAt":"2024-07-31T22:33:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"hwu36","name":"Haicheng Wu","path":"/hwu36","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/57973641?s=80&v=4"},"commit":{"message":"Add extended wgmma shapes for all data types (#1666)","shortMessageHtmlLink":"Add extended wgmma shapes for all data types (#1666)"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEuww2jgA","startCursor":null,"endCursor":null}},"title":"Activity · NVIDIA/cutlass"}