Hansung Kim
|
e1b0fc3944
|
generate_matrix.py: Rand [0,1); also save non-swizzled row-major B
|
2024-10-29 14:55:32 -07:00 |
|
Hansung Kim
|
34d0956cd5
|
tensor: Attempt row-major mapping for C store (WIP)
Doesn't work because 1x2 jagged mapping is required to achieve
throughput for storing the bigger C matrix (2x4, vs. 2x2 in A).
|
2024-10-02 15:14:55 -07:00 |
|
Hansung Kim
|
3490294626
|
generate_matrix.py: switch to fp16 rand, generate row-major A
|
2024-10-02 11:01:23 -07:00 |
|
Hansung Kim
|
863e92a85e
|
generate_matrix.py: Default to range, fp32
|
2024-09-07 17:40:21 -07:00 |
|
Hansung Kim
|
bde6f0ea2e
|
py: Write P_expected, don't rewrite vars
|
2024-09-05 16:23:32 -07:00 |
|
Hansung Kim
|
f7603b18d3
|
flash.py: Write V to file
|
2024-09-01 18:17:05 -07:00 |
|
Hansung Kim
|
4260bf7d6e
|
Generate S matrix, pull out FA stuff from basic script
|
2024-08-28 16:13:38 -07:00 |
|
Hansung Kim
|
60aec1de8d
|
flash.py: Fix row-wise scaling of O, col_to_save
|
2024-08-20 14:49:25 -07:00 |
|
Hansung Kim
|
2f7fb372f1
|
Fix range for
i'm a python noob
|
2024-08-19 21:19:36 -07:00 |
|
Hansung Kim
|
09afd43904
|
More flash in generate_matrix
|
2024-08-19 21:16:37 -07:00 |
|
Hansung Kim
|
351e17c849
|
Separate golden script for flashattn
|
2024-08-19 21:16:13 -07:00 |
|
Hansung Kim
|
3f4abc542c
|
tensor: Fix dimensions and makefile
|
2024-08-19 17:37:26 -07:00 |
|
Hansung Kim
|
fd2ff6208d
|
Generate golden data for flash in generate_matrix.py
|
2024-08-15 17:41:57 -07:00 |
|
Hansung Kim
|
95e3e96c6c
|
tensor: Change B in-memory layout to column-major
|
2024-08-12 15:22:07 -07:00 |
|
Hansung Kim
|
07dd9e35a0
|
tensor: Fix dimensions for fp16 in script
|
2024-08-12 15:22:07 -07:00 |
|
Hansung Kim
|
c1906ebb4f
|
tensor: Embed binary instead of hardcoding literals
the C compiler doesn't support fp16
|
2024-08-12 15:22:07 -07:00 |
|
Hansung Kim
|
1b5daccac9
|
tensor: Generate fp16-packed matrix in script
|
2024-08-12 15:22:07 -07:00 |
|
Hansung Kim
|
a12f2c296c
|
tensor: Update readme
|
2024-07-31 11:55:28 -07:00 |
|
Hansung Kim
|
446b1a4c2e
|
tensor: Add readme
|
2024-07-31 11:53:31 -07:00 |
|
Hansung Kim
|
285776404f
|
tensor: Fix tensor unittest kernel
|
2024-07-31 11:49:41 -07:00 |
|
Hansung Kim
|
29f7290948
|
tensor: Fix correctness script
|
2024-07-31 11:39:50 -07:00 |
|
Hansung Kim
|
800d9801b5
|
tensor: Test with multiple accumulators
|
2024-06-07 18:19:20 -07:00 |
|
Hansung Kim
|
2cac995db9
|
tensor: generate 8x8 in correctness script
|
2024-06-07 18:13:57 -07:00 |
|
Hansung Kim
|
483f975439
|
Merge branch 'kernels' into tensor_core
|
2024-06-07 16:27:01 -07:00 |
|
Hansung Kim
|
c08a4cba8b
|
Add -ffixed-regs to tests/kernel makefile
|
2024-05-26 13:56:34 -07:00 |
|
Hansung Kim
|
0a884e1ead
|
tensor: spawn on all warps, 8 lanes
|
2024-05-25 20:19:57 -07:00 |
|
Hansung Kim
|
8a521a1de8
|
Add 8-lane operand mapping
|
2024-05-10 23:23:11 -07:00 |
|
Richard Yan
|
33066af56e
|
cisc gemmini
|
2024-05-08 15:46:20 -07:00 |
|
Hansung Kim
|
6ba6a1e2e5
|
Merge branch 'kernels' into tensor_core
|
2024-05-08 13:25:31 -07:00 |
|
Hansung Kim
|
5821bfd10d
|
Repeat vx_wmma issue & hardcode dst address
|
2024-05-08 13:22:26 -07:00 |
|
Hansung Kim
|
b4c812f9f8
|
Write expected_C to a binary file
|
2024-05-05 18:27:56 -07:00 |
|
joshua
|
5bd25985c6
|
i kinda forgot most of changes
|
2024-05-04 23:01:47 -07:00 |
|
Richard Yan
|
1b6ebf86a1
|
update gemmini kernels
|
2024-05-02 15:16:55 -07:00 |
|
Richard Yan
|
041d49fb58
|
update gemmini only kernel
|
2024-04-15 10:22:00 -07:00 |
|
Richard Yan
|
7bf72c9568
|
cycle counting for fence
|
2024-04-09 19:53:17 -07:00 |
|
Richard Yan
|
84a31f3384
|
thread parallel data loading for word strided bank
|
2024-04-01 11:10:32 -07:00 |
|
joshua
|
d8f9359fae
|
test case update
|
2024-03-28 13:04:02 -07:00 |
|
joshua
|
e16584ddd9
|
bleh still not work
|
2024-03-27 00:26:04 -07:00 |
|
Richard Yan
|
b88dbd7a83
|
add cycle count and multi core support
|
2024-03-26 16:43:49 -07:00 |
|
Richard Yan
|
c18267443f
|
matmul kernel switch to proper fence and fsm
|
2024-03-20 15:22:25 -07:00 |
|
Richard Yan
|
94ad1850a9
|
implement correct gemmini fence and loop fsm support
|
2024-03-20 15:18:31 -07:00 |
|
joshua
|
beb3dce46d
|
integer reduction unit
|
2024-03-06 01:39:17 -08:00 |
|
Richard Yan
|
5b1c527186
|
Merge branch 'kernels' of https://github.com/hansungk/vortex into kernels
|
2024-02-24 00:27:23 -08:00 |
|
Richard Yan
|
914864206a
|
MMIO gemmini matmul kernel
|
2024-02-24 00:27:16 -08:00 |
|
Hansung Kim
|
a43d5eb1a7
|
Merge remote-tracking branch 'upstream/master' into kernels
|
2024-02-12 20:50:32 -08:00 |
|
Richard Yan
|
12bdab8043
|
update gemmini matmul kernel
|
2024-02-08 17:00:19 -08:00 |
|
Hansung Kim
|
b5bfa7d4b9
|
Fix bogus spad address
|
2024-02-01 14:05:13 -08:00 |
|
Hansung Kim
|
0462a91953
|
Update mmio kernel to do single gemm
|
2024-02-01 13:52:29 -08:00 |
|
Hansung Kim
|
7f6f1d605f
|
Add bare mmio kernel
|
2024-01-24 16:24:19 -08:00 |
|
Blaise Tine
|
62cdd8e993
|
minor update
|
2023-11-11 15:49:39 -08:00 |
|